TRANSACTIONS 


OF THE 


AMERICAN MATHEMATICAL SOCIETY 


EDITED BY 


WILLIAM C. GRAUSTEIN 
EINAR HILLE 
C. C. MACDUFFEE 


WITH THE COOPERATION OF 


A. A. ALBERT ERIC T. BELL JESSE DOUGLAS 

T. H. HILDEBRANDT E. P. LANE R. E. LANGER 

MARSTON MORSE OYSTEIN ORE H. L. RIETZ 

H. P. ROBERTSON M. H. STONE J. L. SYNGE 

GABRIEL SZEGO G. T. WHYBURN OSCAR ZARISKI 
VOLUME 43 


JANUARY TO JUNE, 1938 


PUBLISHED BY THE SOCIETY 
MENASHA, WIS., AND NEW YORK 
1938 


BOSTON UNIVERSITY 
COLLEGE OF LIBERAL ARTS 




Composed, Printed and Bound by 
The Collegiate Press


George Banta Publishing Company 
Menasha, Wisconsin 




TABLE OF CONTENTS 
VOLUME 43, JANUARY TO JUNE, 1938 


AGNEW, R. P., of Ithaca, N. Y. Comparison of products of methods of .......... 327
AHLFORS, L. V., of Cambridge, Mass. An extension of Schwarz’s lemma .......... 359
ALBERT, A. A., of Chicago, Ill. Symmetric and alternate matrices in an .......... 386
BERNSTEIN, B. A., of Berkeley, Calif. Postulates for abelian groups and fields in terms of non-associative operations .......... 1
BIRKHOFF, G., of Cambridge, Mass. Analytical groups .......... 61
BLISS, G. A., of Chicago, Ill. Normality and abnormality in the calculus .......... 365
CARLITZ, L., of Durham, N. C. A class of polynomials .......... 167
DE CICCO, J., of New York, N. Y. The geometry of whirl series .......... 344
FRIEDMAN, B., of Cambridge, Mass. Analyticity of equilibrium figures of .......... 183
GORE, G. D., of Chicago, Ill. Transformations of a surface bearing a family .......... 303
IYER, V. G., of Madras, S. India. A correction to the paper “On effective sets of points in relation to integral functions” .......... 494
JAMES, R. D., of Berkeley, Calif. A problem in additive number theory .......... 296
LEHMER, D. H., of Bethlehem, Pa. On the series for the partition function .......... 271
LEVINSON, N., of Princeton, N. J. On the growth of analytic functions .......... 240
LEWY, H., of Berkeley, Calif. On differential geometry in the large, I. .......... 258
    Generalized integrals and differential equations .......... 437
LUBBEN, R. G., of Austin, Texas. Concerning limiting sets in abstract .......... 482
MACLANE, S., of Ithaca, N. Y. The Schönemann-Eisenstein irreducibility .......... 226
MILGRAM, A. N., of Philadelphia, Pa. Decompositions and dimension of .......... 465
MORREY, C. B., of Berkeley, Calif. On the solutions of quasi-linear elliptic .......... 126
ORE, O., of New Haven, Conn. On functions with bounded derivatives .......... 321
PORITSKY, H., of Schenectady, N. Y. Generalizations of the Gauss law of .......... 199
RANDELS, W. C., of Evanston, Ill. On an approximate functional equation .......... 102
SINGER, J., of Brooklyn, N. Y. A theorem in finite projective geometry and some applications to number theory .......... 377
WIDDER, D. V., of Cambridge, Mass. The Stieltjes transform .......... 7




POSTULATES FOR ABELIAN GROUPS AND FIELDS 
IN TERMS OF NON-ASSOCIATIVE OPERATIONS* 


BY 
B. A. BERNSTEIN 


1. Object. The object of this paper is to present sets of postulates for
abelian groups and fields in terms of the non-associative (and non-commutative)
operations “−” and “/”, the inverses of “+” and “×” in a field. The
postulates will thus treat directly the properties of the inverse operations in
a field, properties important from the standpoint of operations in general, but
perhaps not sufficiently emphasized in the usual treatment of groups and
fields. Three sets of postulates will be given for fields. In each set, the
postulates free from “/”, taken by themselves, will form a set of postulates for
abelian groups. Unlike other sets of postulates for fields known to me, the
sets offered contain no (unconditioned) existence proposition other than one
demanding that the class contain at least two elements. The consistency,
necessariness, and sufficiency of the postulates are established by the usual
methods. The postulates will be found to be simple and “natural”.

2. Postulates (F) for fields. A class K of elements will be a (non-trivial)
field with respect to a pair of binary operations “−”, “/” if K, −, / satisfy
the postulates N, $S_1$-$S_3$, $D_1$-$D_6$ following:

N. K contains at least two distinct elements.

$S_1$. $a = a - (b - b)$,
if a, b, and the combinations indicated are in K.

$S_2$. $a - (b - c) = c - (b - a)$,
if a, b, c, and the combinations indicated are in K.

$S_3$. If a, b are in K then $a - b$ is in K.

$D_1$. $a = a/(b/b)$,
if a, b, and the combinations indicated are in K.

$D_2$. $a/(b/c) = c/(b/a)$,
if a, b, c, and the combinations indicated are in K.


* Presented to the Society, April 11, 1936; received by the editors February 23, 1937. 

At the time of the reading of the paper no other postulate set for groups or for fields in terms of 
“indirect” operations was known to me, except Wiener’s postulates for fields in terms of “@” (Pub- 
lications of the Massachusetts Institute of Technology, June, 1920). Since then, there have come to 
my attention papers on groups treated from my general point of view by R. Baer and F. Levi 
(Zentralblatt, vol. 4, p. 338), Morgan Ward (these Transactions, vol. 32, p. 526), G. Y. Rainich, 
D. G. Rabinow. Rainich’s paper and Rabinow’s are, I understand, in press. I have modified the origi- 
nal draft of my paper so as to put in the background possible duplications of the results of these 
writers. 




$D_3$. If a, b, $b - b$ are in K, and if $b \ne b - b$, then $a/b$ is in K.

$D_4$. If a, b, $b - b$ are in K, and if $b = b - b$, then $a/b$ is not in K.

$D_5$. If a, b, and the combinations indicated are in K, and if $a/b = a/b - a/b$,
then $a = a - a$.

$D_6$. $(a - b)/c = a/c - b/c$,
if a, b, c, and the combinations indicated are in K.


3. Consistency, necessariness, and independence of Postulates (F). The 
following arithmetic system (K, —, /) satisfies all the Postulates (F): 


$K = 0, 1$; $a - b = a + b \pmod 2$; $a/b = a \div b \pmod 2$.


Hence, Postulates (F) are consistent. 

The properties given by Postulates (F) are seen to be properties of “sub- 
traction” and “division” in a field. Hence, Postulates (F) are necessary for a 
field. 

Finally, Postulates (F) are mutually independent. The independence-sys- 
tems are given by the table below. The systems in this table are all arithmetic. 
Some of the systems are modular, the modulus being enclosed in parentheses 
following the operations. A system contradicting a postulate P is lettered P. 
The independence-table follows. 


INDEPENDENCE-SYSTEMS FOR POSTULATES (F)

System   K               a − b          a/b

N        Null class
S_1      0, 1            0              a ÷ b (2)
S_2      0, 1            a              0 ÷ 0
S_3      0, 1            0 ÷ 0          a ÷ b (2)
D_1      0, 1, 2, 3, 4   a + 4b (5)     ab + (0 ÷ b) (5)
D_2      0, 1, 2         a + 2b (3)     a + (0 ÷ b)
D_3      0, 1            a + b (2)      0 ÷ 0
D_4      0, 1            a + b (2)      a + [0 ÷ (ab + a + 1)] (2)
D_5      0, 1            a + b (2)      0 + (0 ÷ b)
D_6      0, 1, 2         a + 2b (3)     + (0 ÷ b) (3)
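The consistency and independence claims of §3 can be checked mechanically on small systems. The following is a minimal sketch, not the author's method: it assumes that a result "not in K" is modelled by `None`, that each conditional postulate is vacuously true when an indicated combination is not in K, and that "a ÷ b" denotes ordinary division, undefined for b = 0. The function name `violated` is hypothetical.

```python
from itertools import product

def violated(K, sub, div):
    """Return the labels of the postulates of (F) that fail in (K, sub, div).

    sub(a, b) and div(a, b) may return None, meaning "not in K".
    """
    K = list(K)

    def s(x, y):  # subtraction that propagates "not in K"
        if x is None or y is None:
            return None
        r = sub(x, y)
        return r if r in K else None

    def d(x, y):  # division that propagates "not in K"
        if x is None or y is None:
            return None
        r = div(x, y)
        return r if r in K else None

    def differ(x, y):  # both sides are in K and they are unequal
        return x is not None and y is not None and x != y

    bad = set()
    if len(set(K)) < 2:
        bad.add("N")
    for a, b in product(K, repeat=2):
        zero_b = s(b, b)
        if s(a, b) is None:
            bad.add("S3")
        if differ(a, s(a, zero_b)):
            bad.add("S1")
        if differ(a, d(a, d(b, b))):
            bad.add("D1")
        if zero_b is not None:
            if b != zero_b and d(a, b) is None:
                bad.add("D3")
            if b == zero_b and d(a, b) is not None:
                bad.add("D4")
        q = d(a, b)
        if q is not None and q == s(q, q) and differ(a, s(a, a)):
            bad.add("D5")
    for a, b, c in product(K, repeat=3):
        if differ(s(a, s(b, c)), s(c, s(b, a))):
            bad.add("S2")
        if differ(d(a, d(b, c)), d(c, d(b, a))):
            bad.add("D2")
        if differ(d(s(a, b), c), s(d(a, c), d(b, c))):
            bad.add("D6")
    return bad

systems = {
    "consistency": ([0, 1], lambda a, b: (a + b) % 2, lambda a, b: a if b == 1 else None),
    "S2": ([0, 1], lambda a, b: a, lambda a, b: None),
    "D1": (range(5), lambda a, b: (a + 4 * b) % 5, lambda a, b: (a * b) % 5 if b else None),
    "D5": ([0, 1], lambda a, b: (a + b) % 2, lambda a, b: 0 if b else None),
}
for name, (K, sub, div) in systems.items():
    print(name, sorted(violated(K, sub, div)))
```

Under these assumptions the consistency system violates nothing, and each independence system tried here violates exactly the postulate after which it is lettered.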


4. Theorems. The theorems $T_1$-$T_{22}$ following are derivable from Postulates (F).
They will establish the sufficiency of Postulates (F) for fields.

$T_1$. $a = b - (b - a)$.

$T_2$. $a - b = (c - c) - (b - a)$.

$T_3$. $(a - b) - c = (a - c) - b$.

$T_4$. $a - a = b - b$.

DEFINITION 1. $0 = a - a$.
DEFINITION 2. $a' = 0 - a$.
DEFINITION 3. $a + b = a - b'$.

$T_5$. If a, b are in K then $a + b$ is in K.

$T_6$. $a + b = b + a$.

$T_7$. $a + (b + c) = (a + b) + c$.

$T_8$. For any two elements a, b in K there is an x in K, namely, $x = b - a$, such
that $a + x = b$.


In Theorems $T_9$-$T_{22}$ and in Definitions 4, 5, 6 following, unless told
otherwise, it is supposed that the elements indicated are all in K, i.e. ($D_3$, $D_4$),
it is supposed that no “divisor” is 0. In the proofs of the theorems,
“Hypothesis” will refer to this supposition.

$T_9$. If $a = 0$ then $a/b = 0$, and conversely.

$T_{10}$. $a = b/(b/a)$.

$T_{11}$. $a/b = (c/c)/(b/a)$.

$T_{12}$. $(a/b)/c = (a/c)/b$.

$T_{13}$. $a/a = b/b$.

DEFINITION 4. $1 = a/a$.
DEFINITION 5. $a_1 = 1/a$.
DEFINITION 6. $ab = a/b_1$, $(b \ne 0)$; $a0 = 0$.

$T_{14}$. $1 \ne 0$.

$T_{15}$. $a_1 \ne 0$.

$T_{16}$. $(a/b)_1 = b/a$.

$T_{17}$. $0a = 0$.

$T_{18}$. If a, b are in K then ab is in K.

$T_{19}$. $ab = ba$.

$T_{20}$. $a(bc) = (ab)c$.

$T_{21}$. For any two elements $a \ne 0$, $b \ne 0$, there is an x, namely $x = b/a$, such that
$ax = b$.

$T_{22}$. $a(b + c) = ab + ac$.
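Definitions 1 to 6 rebuild addition and multiplication from the two inverse operations alone. A minimal sketch follows, assuming the rational numbers as the field and Python's `Fraction` for exact arithmetic; the function names are hypothetical. It checks on a finite sample that the derived operations agree with the ordinary ones and that the distributive law of the last theorem holds.

```python
from fractions import Fraction
from itertools import product

def sub(a, b):
    return a - b

def div(a, b):
    return a / b          # callers keep b != 0, as in the Hypothesis of section 4

def zero(a):              # Definition 1: 0 = a - a
    return sub(a, a)

def neg(a):               # Definition 2: a' = 0 - a
    return sub(zero(a), a)

def add(a, b):            # Definition 3: a + b = a - b'
    return sub(a, neg(b))

def one(a):               # Definition 4: 1 = a/a  (a != 0)
    return div(a, a)

def recip(a):             # Definition 5: a_1 = 1/a  (a != 0)
    return div(one(a), a)

def mul(a, b):            # Definition 6: ab = a/b_1 (b != 0); a0 = 0
    return div(a, recip(b)) if b != zero(b) else zero(b)

samples = [Fraction(n, m) for n in range(-3, 4) for m in (1, 2, 3)]

matches_builtin = all(
    add(a, b) == a + b and mul(a, b) == a * b
    for a, b in product(samples, repeat=2)
)
distributive = all(
    mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
    for a, b, c in product(samples, repeat=3)
)
print(matches_builtin, distributive)
```

A finite sample is of course no proof; the sketch only illustrates how the definitions compose.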


5. Proofs of the theorems. The proofs of Theorems $T_1$-$T_{22}$ follow. In
$T_9$-$T_{22}$, if $a \ne 0$, $b \ne 0$, $c \ne 0$, then any combination of a, b, c is in K, by $S_3$, $D_3$.
This fact will generally be understood in the proofs.

Proof of $T_1$. $a = a - (b - b) = b - (b - a)$, by $S_1$, $S_2$.

Proof of $T_2$. $a - b = a - [b - (c - c)] = (c - c) - (b - a)$, by $S_1$, $S_2$.

Proof of $T_3$. $(a - b) - c = (c - c) - [c - (a - b)] = (c - c) - [b - (a - c)]
= (a - c) - b$, by $T_2$, $S_2$, $T_2$.

Proof of $T_4$. $a - a = b - [b - (a - a)] = b - b$, by $T_1$, $S_1$.




Proof of $T_5$. By Definition 3, Definition 2, Definition 1, $S_3$.

Proof of $T_6$. $a + b = a - b' = a - (0 - b) = b - (0 - a) = b - a' = b + a$, by
Definition 3, Definition 2, $S_2$, Definition 2, Definition 3.

Proof of $T_7$. $a + (b + c) = (b + c) + a = (b - c') - a' = (b - a') - c' = (b + a) + c
= (a + b) + c$, by $T_6$, Definition 3, $T_3$, Definition 3, $T_6$.

Proof of $T_8$. $a + (b - a) = a - (b - a)' = a - [0 - (b - a)] = a - [(c - c) - (b - a)]
= a - (a - b) = b$, by Definition 3, Definition 2, Definition 1, $T_2$, $T_1$.

Proof of $T_9$. If $a = 0$, then $a/b = (b - b)/b = b/b - b/b = 0$, by Definition 1,
$D_6$, Definition 1. If $a/b = 0$, then $a/b = a/b - a/b$, by Definition 1; hence
$a = a - a = 0$, by $D_5$, Definition 1.

Proof of $T_{10}$. $a \ne 0$, $b/a \ne 0$, by Hypothesis. Hence $b \ne 0$, by $T_9$. Hence,
$a = a/(b/b) = b/(b/a)$, by $D_1$, $D_2$.

Proof of $T_{11}$. $c \ne 0$, by Hypothesis. Hence, $a/b = a/[b/(c/c)] = (c/c)/(b/a)$,
by $D_1$, $D_2$.

Proof of $T_{12}$. $b \ne 0$, $c \ne 0$, by Hypothesis. If $a \ne 0$, then $(a/b)/c
= (c/c)/[c/(a/b)] = (c/c)/[b/(a/c)] = (a/c)/b$, by $T_{11}$, $D_2$, $T_{11}$. If $a = 0$, then
$(a/b)/c = 0 = (a/c)/b$, by $T_9$, $T_9$.

Proof of $T_{13}$. $a \ne 0$, by Hypothesis. Hence $a/a = b/[b/(a/a)] = b/b$,
by $T_{10}$, $D_1$.

Proof of $T_{14}$. $a \ne 0$, by Hypothesis. Hence, $1 = a/a \ne 0$, by Definition 4, $T_9$.

Proof of $T_{15}$. $a \ne 0$, by Hypothesis. Hence, $a_1 = 1/a \ne 0$, by Definition 5, $T_9$.

Proof of $T_{16}$. $a \ne 0$, by Hypothesis. Hence, $(a/b)_1 = 1/(a/b)
= (b/b)/(a/b) = b/a$, by Definition 5, Definition 4, $T_{11}$.

Proof of $T_{17}$. If $a \ne 0$, then $0a = 0/a_1 = 0$, by Definition 6, $T_9$. If $a = 0$, then
$0a = 0$, by Definition 6.

Proof of $T_{18}$. If $b \ne 0$, then $b_1 \ne 0$, by $T_{15}$; hence, if $b \ne 0$, then $ab = a/b_1$ is
in K, by Definition 6, $D_3$. If $b = 0$, then ab is in K, by Definition 6.

Proof of $T_{19}$. If $a \ne 0$, $b \ne 0$, then $ab = a/b_1 = a/(1/b) = b/(1/a) = b/a_1 = ba$,
by Definition 6, Definition 5, $D_2$, Definition 5, Definition 6. If $a = 0$, or $b = 0$,
then $ab = 0 = ba$, by $T_{17}$, Definition 6.

Proof of $T_{20}$. If $a \ne 0$, $b \ne 0$, $c \ne 0$, then $a(bc) = (bc)a = (b/c_1)/a_1 = (b/a_1)/c_1
= (ba)c = (ab)c$, by $T_{19}$, Definition 6, $T_{12}$, Definition 6, $T_{19}$. If $a = 0$, or $b = 0$,
or $c = 0$, then $a(bc) = 0 = (ab)c$, by $T_{17}$, Definition 6.

Proof of $T_{21}$. $a(b/a) = a/(b/a)_1 = a/(a/b) = b$, by Definition 6, $T_{16}$, $T_{10}$.

Proof of $T_{22}$. We first prove the Lemma $ab' = (ab)'$. The Lemma is true
because, if $a \ne 0$, then $ab' = b'a = b'/a_1 = (0 - b)/a_1 = 0/a_1 - b/a_1 = 0 - b/a_1
= (b/a_1)' = (ba)' = (ab)'$, by $T_{19}$, Definition 6, Definition 2, $D_6$, $T_9$, Definition 2,
Definition 6, $T_{19}$; if $a = 0$, then $ab' = 0 = (ab)'$, by
Definition 1, $T_{17}$, Definition 2, hypothesis. If now $a \ne 0$, then $a(b + c) = (b + c)a
= (b - c')a = (b - c')/a_1 = b/a_1 - c'/a_1 = ba - c'a = ab - ac' = ab - (ac)' = ab + ac$,
by $T_{19}$, Definition 3, Definition 6, $D_6$, Definition 6, $T_{19}$, Lemma, Definition 3;
if $a = 0$, then $a(b + c) = 0 = 0 - (0 - 0) = 0 - 0' = 0 + 0 = 0b + 0c = ab + ac$, by $T_{17}$,
$S_1$, Definition 2, Definition 3, $T_{17}$, hypothesis.

6. Sufficiency of Postulates (F) for fields. From propositions N, $T_5$-$T_8$,
$T_{18}$-$T_{22}$ we see that Postulates (F) are sufficient for a field.*

7. Postulates (F’), (F’’) for fields. Following are two other sets of postulates
for fields:

(F’) N, $T_1$, $S_2$, $S_3$, $D_1$-$D_6$.
(F’’) N, $T_1$, $T_3$, $S_3$, $D_1$-$D_6$.

In $T_1$ is understood the condition if a, b, and the combinations indicated are
in K. In $T_3$ is understood the condition if a, b, c, and the combinations
indicated are in K.

It is clear that the sets (F’) and (F’’) are each consistent and necessary 
for a field. 

The postulates in each of the sets (F’) and (F’’) are independent.
Independence-systems for (F’) are the same as those for (F), given by the table
of §3, except that systems $S_1$, $S_2$ are replaced respectively by $(\alpha)$, $(\beta)$
following:

$(\alpha)$ $K = 0, 1$; $a - b = 0$; $a/b = a + (0 \div b)$.

$(\beta)$ $K = 0, 1$; $a - b = b$; $a/b = 0 \div 0$.

Independence-systems for (F’’) are the same as those for (F), except that
systems $S_1$, $S_2$ are replaced respectively by $(\gamma)$, $(\delta)$ following:

$(\gamma)$ $K = 0, 1$; $a - b = a$; $a/b = 0 \div 0$.

$(\delta)$ $K = 0, 1$; $a - b = b$; $a/b = 0 \div 0$.

The proof of the sufficiency of (F’), (F’’) for fields is left to the reader. 

8. Postulates for abelian groups. Corresponding to (F), (F’), (F’’) we 
have respectively (A), (A’), (A’’) following as postulate sets for (non-trivial) 
abelian groups: 

(A) N, $S_1$, $S_2$, $S_3$.

(A’) N, $T_1$, $S_2$, $S_3$.

(A’’) N, $T_1$, $T_3$, $S_3$.

Clearly, (A), (A’), (A’’) are each consistent and necessary for abelian 
groups. 

Independence-systems for (A), (A’), (A’’) respectively are the independ- 
ence-systems for the corresponding fields confined to K, —. 


* Compare E. V. Huntington, Definitions of a field by sets of independent postulates, these Trans- 
actions, vol. 4 (1903), p. 33. 




The sufficiency of (A) for abelian groups follows from N, $T_5$-$T_8$.* The
proof of the sufficiency of (A’) and of (A’’) is left to the reader.


* Compare E. V. Huntington, Two definitions of an abelian group by sets of independent 
postulates, these Transactions, vol. 4 (1903), p. 27. 

† Since the above was written, two papers by David G. Rabinow have appeared: Independent
sets of postulates for abelian groups and fields in terms of the inverse operations, American Journal of
Mathematics, vol. 59 (1937), pp. 211-224; Note on the definition of fields by independent postulates in
terms of the inverse operations, ibid., pp. 385-392. In the first of these papers the author recapitulates
his results in an earlier paper, entitled Independent set of postulates for a group in terms of the inverse
operation, offered to the Bulletin of the Society about May 6, 1936. This is the paper of Rabinow’s
to which I referred in my first footnote. The results in this paper differ little from those in Ward’s
paper (cited above). As far as I know, the paper has not yet been published. Rabinow’s two Journal
papers are closely related to mine. (In the first paper, the postulates for abelian groups are precisely
my set (A’’), without my Postulate N and without the restrictions on the elements in my $T_1$, $T_3$.)
In neither paper is there reference to Ward’s paper or to mine (Abstract 42-5-133, Bulletin of the
American Mathematical Society).


THE UNIVERSITY OF CALIFORNIA, 
BERKELEY, CALIF. 


THE STIELTJES TRANSFORM* 


BY 
D. V. WIDDER 


Introduction. The Stieltjes transform is defined by the equation

\[ f(x) = \int_0^\infty \frac{d\alpha(t)}{x+t} = \lim_{R\to\infty} \int_0^R \frac{d\alpha(t)}{x+t}. \tag{1} \]

We assume that $\alpha(t)$ is of bounded variation in (0, R) for every positive R
and that the limit (1) exists. If $\alpha(t)$ is the integral of a function $\varphi(t)$, we
obtain the special case

\[ f(x) = \int_0^\infty \frac{\varphi(t)}{x+t}\,dt. \tag{2} \]
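As an illustration of the special case (2), the transform can be approximated numerically and compared with a closed form. The sketch below is only an example under stated assumptions: the test function $\varphi(t) = (1+t)^{-2}$ is chosen here for convenience and does not appear in the paper, the closed form follows from partial fractions, and the names `stieltjes_transform` and `closed_form` are hypothetical.

```python
import math

def stieltjes_transform(phi, x, n=20000):
    """Midpoint-rule approximation of the integral of phi(t)/(x+t) over (0, inf).

    The substitution t = u/(1-u) maps (0, inf) onto (0, 1); the midpoint rule
    never evaluates the integrand at the endpoint u = 1.
    """
    total = 0.0
    for i in range(n):
        u = (i + 0.5) / n
        t = u / (1.0 - u)
        total += phi(t) / (x + t) / (1.0 - u) ** 2   # dt = du / (1-u)^2
    return total / n

def phi(t):
    # Assumed test function, not taken from the paper.
    return 1.0 / (1.0 + t) ** 2

def closed_form(x):
    # Partial fractions give f(x) = 1/(x-1) - log(x)/(x-1)^2 for x != 1.
    return 1.0 / (x - 1.0) - math.log(x) / (x - 1.0) ** 2

for x in (0.5, 2.0, 10.0):
    print(x, stieltjes_transform(phi, x), closed_form(x))
```

The two columns should agree to many decimal places for each positive x other than 1, where the closed form as written is indeterminate.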

In this paper we discuss two distinct but related questions, the inversion
problem and the solubility problem. In the former we assume that f(x) is a
function which admits the representation (1) or (2), and seek to determine
$\alpha(t)$ or $\varphi(t)$ from f(x). In the latter we seek necessary and sufficient conditions
on f(x) that it should have the representation (1) or (2).

A solution of the inversion problem was given by Stieltjes† himself by
means of contour integration. His result was

t oe 1 —ty 
f(s)ds, 


2 _t-iy 


where the symbol R means “real part of.” 

It is our purpose to obtain a real inversion formula, one depending only
on a knowledge of f(x) and its derivatives on the positive real axis. One may
conjecture the existence of such a formula by noting that the Stieltjes
transform is the product of two Laplace transforms. That is,

\[ f(x) = \int_0^\infty e^{-xt}\,dt \int_0^\infty e^{-tu}\,d\alpha(u). \]

But a Laplace transform admits of two types of inversion, one by contour
integration and one by use of the successive derivatives of f(x) on the
positive real axis.‡ As we showed in the paper cited, these two inversions are
analogous to the two classical determinations of the coefficients of a power
series, one by Cauchy’s integral, the other by Taylor’s series.

* Presented to the Society, December 31, 1936; received by the editors March 1, 1937.
† Collected works of T. J. Stieltjes, vol. 2, p. 473.
‡ D. V. Widder, The inversion of the Laplace integral and the related moment problem, these
Transactions, vol. 36 (1934), p. 107.

A real inversion of the integral (2) has been found recently by Paley and
Wiener* in case $\varphi(t)$ and $[\varphi(t)]^2$ are integrable in the interval $(0, \infty)$. The
result is


1 m — 1)” d\2n 
(3) 


me (2n)! dt 


The formula obtained in the present paper seems, at first sight, totally
unrelated, but we shall show later that this is not the case. It is

\[ L_{k,t}[f(x)] = \frac{(-1)^{k-1}}{k!\,(k-2)!}\; t^{k-1}\, \frac{d^{2k-1}}{dt^{2k-1}}\bigl[t^{k} f(t)\bigr]. \]

This is a linear differential operator of order 2k−1. With no restrictions on
$\varphi(t)$ beyond those necessary to make (2) converge we show that

\[ \lim_{k\to\infty} L_{k,t}[f(x)] = \varphi(t) \]

for almost all positive t. Further, we prove that

\[ \lim_{k\to\infty} \int_0^t L_{k,u}[f(x)]\,du = \frac{\alpha(t+) + \alpha(t-)}{2} - \alpha(0+) \]

for all positive t.

The method employed is the same as that used earlier by the author for
the Laplace transform. That is, one employs a known method for discussing
the asymptotic behavior of an integral of the form

\[ \int \varphi(t)\,[g(t)]^{k}\,dt \]

as k becomes infinite. In the present case

\[ g(t) = \frac{t}{(x+t)^{2}}, \]

a function which has a single maximum at t=x. We observe that the
fundamental solutions of the linear differential expression $L_{k,x}[f(x)]$ are

\[ x^{n} \qquad (n = -k, -k+1, \dots, k-2). \]

Hence we may say that the Stieltjes transform is inverted by the linear
differential operator of infinite order which has all integral powers of x as its
fundamental solutions.

* R. E. A. C. Paley and N. Wiener, Fourier transforms in the complex domain, American
Mathematical Society Colloquium Publications, vol. 19, 1934.

We solve the solubility problem by obtaining necessary and sufficient
conditions that f(x) should have the representation (1) or (2) with $\alpha(t)$ or $\varphi(t)$
belonging to certain familiar classes. The most important classes considered
are $\alpha(t)$ of bounded variation or non-decreasing, $\varphi(t)$ of class $L^p$ $(p \ge 1)$ or
bounded. For example, we prove that f(x) has the form (1) with $\alpha(t)$
non-decreasing if and only if

\[ f(x) \ge 0, \qquad (-1)^{k-1}\bigl[x^{k} f(x)\bigr]^{(2k-1)} \ge 0 \qquad (x > 0;\ k = 1, 2, \dots), \]
\[ f(\infty) = 0. \]

This is the analogue of Bernstein’s theorem on completely monotonic
functions. The case of the integral (2) with $\varphi(t)$ belonging to $L^2$ was treated by
Paley and Wiener in the Colloquium lectures cited above.


A formula for computing the saltus of $\alpha(t)$ at a point of discontinuity is
also obtained. In fact we show that

\[ \lim_{k\to\infty} 2t \left(\frac{\pi}{k}\right)^{1/2} L_{k,t}[f(x)] = \alpha(t+) - \alpha(t-) \qquad (t > 0). \]

An inversion formula for the generalized Stieltjes transform

\[ f(x) = \int_0^\infty \frac{d\alpha(t)}{(x+t)^{\rho}} \qquad (\rho > 0) \]

is also obtained.
The final section of the paper shows the relation between the Paley-Wiener
operator and $L_{k,t}[f(x)]$. The former can be written symbolically as

\[ (\cos \pi D), \]
where
\[ D = t\,\frac{d}{dt}. \]

For, the finite series (3) is clearly a section of the infinite power series for
$\cos \pi D$. We show that $L_{k,t}[f(x)]$ is essentially a section of the familiar infinite
product expansion of the cosine, so that symbolically the operators are
equivalent. It should be observed that the Paley-Wiener operator is
applicable only to the case in which $\varphi(t)$ belongs to $L^2$ (at least in so far as
proofs have yet been given), whereas the operator of the present paper is
not so restricted.




1. General properties. Let $\alpha(t)$ be a complex function of the real variable t
of bounded variation in the interval $0 \le t \le R$ for every positive R. Such a
function is said to be normalized if

\[ \alpha(0) = 0, \qquad \alpha(t) = \frac{\alpha(t+) + \alpha(t-)}{2} \qquad (t > 0). \]

We assume throughout that $\alpha(t)$ has these properties. It is clear that the
integral

\[ \int_0^R \frac{d\alpha(t)}{s+t} \]

exists for every complex $s = \sigma + i\tau$ not on the negative real axis, $\tau = 0$, $\sigma \le 0$,
which we shall henceforth denote by D. Set

\[ f(s) = \int_0^\infty \frac{d\alpha(t)}{s+t} = \lim_{R\to\infty} \int_0^R \frac{d\alpha(t)}{s+t} \tag{1.1} \]

whenever the indicated limit exists. Then the improper integral (1.1) is said
to converge. We show at once that the region of convergence of (1.1) is the
whole complex plane less the ray D.

THEOREM 1.1. If the integral (1.1) converges for a point $s_0$ not on D, it
converges for every such point.


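For, set

\[ \beta(0) = 0, \qquad \beta(t) = \int_0^t \frac{d\alpha(u)}{s_0 + u} \qquad (t > 0). \]

Then for s not on D

\[ \int_0^R \frac{d\alpha(t)}{s+t} = \int_0^R \frac{s_0 + t}{s+t}\,d\beta(t) = \beta(R)\,\frac{s_0 + R}{s + R} + (s_0 - s)\int_0^R \frac{\beta(t)}{(s+t)^{2}}\,dt. \]

Since $\beta(R)$ approaches a limit as R becomes infinite, it is clear that the
integral

\[ \int_0^\infty \frac{\beta(t)}{(s+t)^{2}}\,dt \]

converges absolutely and that (1.1) converges. Moreover,

\[ \int_0^\infty \frac{d\alpha(t)}{s+t} = \int_0^\infty \frac{d\alpha(t)}{s_0+t} + (s_0 - s)\int_0^\infty \frac{\beta(t)}{(s+t)^{2}}\,dt. \tag{1.2} \]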




We observe that (1.1) may converge without converging absolutely as the 
example. 


fi) = 


shows. But (1.2) always enables us to replace the given integral by an abso- 
lutely convergent one. 

At this point we note a contrast with the theory of the Laplace transform. 
In the latter case a direct integration by parts replaces a conditionally con- 
vergent integral by an absolutely convergent one. That this is not always 
true for the Stieltjes transform is seen by an example. Take 


0) =0, 
t+2 


a(t) = ( G4 (n< n+1;n ) 


Then 


c= dt 
o (+ 2)? 2) log (¢ + 2) 


clearly diverges. But 


da(t) a(t) log ( + 3) 


the series converging. For this example, integration by parts replaces a con- 
ditionally convergent integral by another with the same property. 


COROLLARY 1.11. If (1.1) converges, it converges uniformly in any bounded
closed region not containing a point of D.

COROLLARY 1.12. If (1.1) converges, f(s) is analytic at points not on D.

COROLLARY 1.13. If (1.1) converges,

\[ f^{(k)}(s) = (-1)^{k}\,k! \int_0^\infty \frac{d\alpha(t)}{(s+t)^{k+1}} \qquad (k = 0, 1, \dots). \]

Another very useful result is contained in

THEOREM 1.2. If (1.1) converges, then

\[ \alpha(t) = o(t) \qquad (t \to \infty). \tag{1.3} \]


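Let (1.1) converge for $s = s_0$ not on D, and define $\beta(t)$ as in Theorem 1.1.
Then

\[ \alpha(R) = \int_0^R d\alpha(t) = \int_0^R (s_0 + t)\,d\beta(t). \]

But

\[ \int_0^R t\,d\beta(t) = R\,\beta(R) - \int_0^R \beta(t)\,dt, \]
\[ \lim_{R\to\infty} \frac{1}{R}\int_0^R t\,d\beta(t) = \beta(\infty) - \beta(\infty) = 0. \]

Also

\[ \lim_{R\to\infty} \frac{1}{R}\int_0^R s_0\,d\beta(t) = 0, \]

so that (1.3) is established.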


COROLLARY 1.21. If (1.1) converges, then

\[ f(s) = \int_0^\infty \frac{\alpha(t)}{(s+t)^{2}}\,dt. \]


It is important to note that the converse of Theorem 1.2 is false. Thus 
(1.3) holds if 


du 


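Yet for this definition of $\alpha(t)$ the integral (1.1) diverges.
However, it is easily seen that if
\[ \alpha(t) = O(t^{1-\delta}) \qquad (\delta > 0,\ t \to \infty) \]
then (1.1) converges.
The relation between the Laplace and Stieltjes transforms is made precise in

THEOREM 1.3. If the integral (1.1) converges, then

\[ f(s) = \int_{0+}^\infty e^{-st}\varphi(t)\,dt \qquad (\sigma > 0), \tag{1.4} \]
where*
\[ \varphi(t) = \int_0^\infty e^{-tu}\,d\alpha(u) \qquad (t > 0). \tag{1.5} \]

* The notation employed in (1.4) means that $\int_{0+}^\infty \varphi(t)\,dt = \lim_{\epsilon\to 0+} \int_\epsilon^\infty \varphi(t)\,dt$.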




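For, by Theorem 1.2 $\alpha(u) = o(u)$ when u becomes infinite, so that (1.5)
converges uniformly* in the closed interval $\epsilon \le t \le R$ for arbitrary positive
numbers $\epsilon$ and R. Hence

\[ \int_\epsilon^R e^{-st}\,dt \int_0^\infty e^{-tu}\,d\alpha(u) = \int_0^\infty \frac{e^{-(s+u)\epsilon} - e^{-(s+u)R}}{s+u}\,d\alpha(u). \]

If s is any point not on the ray D, the integral

\[ \int_0^\infty \frac{e^{-\epsilon u}}{s+u}\,d\alpha(u) \]

clearly converges, so that

\[ \int_\epsilon^R e^{-st}\,dt \int_0^\infty e^{-tu}\,d\alpha(u) = e^{-s\epsilon}\int_0^\infty \frac{e^{-\epsilon u}}{s+u}\,d\alpha(u) - e^{-sR}\int_0^\infty \frac{e^{-Ru}}{s+u}\,d\alpha(u). \tag{1.6} \]

The first integral on the right-hand side converges uniformly (s being fixed)
in the interval $0 \le \epsilon < \infty$, and hence approaches f(s) as $\epsilon$ approaches zero.
Moreover,†

\[ \lim_{R\to\infty} \int_0^\infty \frac{e^{-Ru}}{s+u}\,d\alpha(u) = \frac{\alpha(0+)}{s}, \]

so that the last term of (1.6) approaches zero with 1/R if $\sigma > 0$. Our result is
consequently established by allowing $\epsilon$ to approach zero and R to become
infinite.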
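The relation (1.4)-(1.5) is easy to test numerically when $\alpha$ is a step-function. The sketch below assumes a hypothetical $\alpha$ with two jumps (sizes and locations chosen here, not taken from the paper), so that (1.5) gives a sum of exponentials and (1.1) a sum of simple fractions; the Laplace integral is truncated at a large T and evaluated by Simpson's rule.

```python
import math

# Assumed example: alpha jumps by c at the point a, for each pair (a, c).
jumps = [(0.5, 2.0), (2.0, -1.0)]

def f(s):
    # Stieltjes transform (1.1) of the step-function: sum of c / (s + a).
    return sum(c / (s + a) for a, c in jumps)

def phi(t):
    # Formula (1.5) for the same alpha: sum of c * exp(-a t).
    return sum(c * math.exp(-a * t) for a, c in jumps)

def laplace(g, s, T=60.0, n=60000):
    """Simpson approximation of the integral of exp(-s t) g(t) over (0, T); n even."""
    h = T / n
    total = g(0.0) + math.exp(-s * T) * g(T)
    for i in range(1, n):
        t = i * h
        total += (4 if i % 2 else 2) * math.exp(-s * t) * g(t)
    return total * h / 3.0

for s in (0.5, 1.5, 4.0):
    print(s, laplace(phi, s), f(s))
```

For positive real s the two printed values should coincide to within the quadrature error.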

Note that the inequality $\sigma > 0$ in (1.4) cannot be replaced by $\sigma \ge 0$ as the
example f(s) = 1/s shows. We observe also that the converse of Theorem 1.3
is not true. That is, the integrals (1.4) and (1.5) may converge in the range
specified without having f(s) represented in the form (1.1). For example, take


se) = f > 0). 
The integral (1.1) becomes 
- (n + 1) 


* See footnote on p. 12. 

† For the results regarding the Laplace transform which are here employed see, for example,
D. V. Widder, A generalization of Dirichlet’s series and of Laplace’s integrals by means of a Stieltjes
integral, these Transactions, vol. 31 (1929), p. 694.




a series which diverges for all s. In this connection we may prove 


THEOREM 1.4. If a(u) is such that the integral (1.5) converges for t=0, then 
the function f(s) defined by (1.4) also has the representation (1.1) for all s not 
on D. 


For, in this case, a(u) is necessarily bounded and we may apply Theorem 
3.2. 
A similar result is contained in 


THEOREM 1.5. If $\alpha(u)$ is non-decreasing and such that (1.4) and (1.5)
converge for $\sigma > 0$, $t > 0$ respectively, then the function f(s) defined by (1.4) has the
representation (1.1) for all s not on D.


The proof is easily supplied. We turn next to the uniqueness theorem. 


THEOREM 1.6. If the normalized function $\alpha(t)$ is such that

\[ \int_0^\infty \frac{d\alpha(t)}{s_0 + t + nl} = 0 \qquad (l > 0;\ n = 0, 1, 2, \dots) \]
then

\[ \alpha(t) = 0 \qquad (0 \le t < \infty). \]

This follows from Theorem 1.3 and from Lerch’s uniqueness theorem for 
Laplace integrals.* 
We conclude this section with a proof of 


THEOREM 1.7. If a(t) is a real non-decreasing function for which the point 
t=0 is a point of increase and for which the integral 


da(t) 

converges, then f(s) has a singularity at s=0. 
For, suppose the contrary. Then the series 


f(s) = poy 


n=0 
converges for some point on the negative real axis, say s= —e, and 
(e 1 n 
n=0 
Applying Corollary 1.13 we obtain 


* The usual proof of this theorem can easily be extended to include Cauchy-values of Laplace 
integrals: 




(e+ 1)" 
f(—e«) = —— da(t 
= 1 a(Z)dt. 
This series dominates the series
(e+ 1)" 
1.7 1 a()dt, 
(1.7) (n+ 


so that the latter also converges. Since the integrand is non-negative, and 
since the series 


(e+ 1)" 1 
1 = 


converges for />e, we may interchange integral and summation symbols in 
(1.7). That is, the integral 


a(t) 
€ (t e)? 


must converge. But since ¢=0 is a point of increase for a(é), it follows that 
a(e+) >0 and 


(1.8) 


Hence (1.8) can not converge. The assumption that f(s) is analytic at s=0 
is untenable. That is, s=0 is a singularity of f(s). 

2. Inversion in a special case. The results of the previous section enable
us to restrict attention to the real variable s = x. In fact we shall even assume
that $\alpha(t)$ is a real function. The loss of generality thus involved is trivial. The
reader who has need of results for complex functions $\alpha(t)$ has only to apply
the theorems proved to the real and imaginary parts of $\alpha(t)$ separately.

We introduce a functional operator by

DEFINITION 2.1. An operator $L_{k,x}[f(x)]$ is defined by the equations

\[ L_{0,x}[f(x)] = c_0 f(x), \]
\[ L_{k,x}[f(x)] = (-1)^{k-1} c_k\, x^{k-1} \bigl[x^{k} f(x)\bigr]^{(2k-1)} \qquad (k = 1, 2, 3, \dots), \]
where
\[ c_0 = c_1 = 1, \qquad c_k = \frac{1}{k!\,(k-2)!} \qquad (k = 2, 3, 4, \dots). \]


a(t) 
lm ——=+ 
toe+ — € 


Obviously the operator can be applied only to functions which possess
derivatives of order (2k−1). It becomes of interest only when k is allowed
to become infinite, so that we shall be applying it only to functions which
possess derivatives of all orders. Our first application will be to the function
(1.1) where $\alpha(t)$ is a step-function defined as follows:

\[ \alpha(t) = \begin{cases} 0 & (0 \le t < a), \\ 1 & (a < t < \infty), \end{cases} \qquad \alpha(a) = \tfrac{1}{2}. \]

In this case
\[ f(x) = \int_0^\infty \frac{d\alpha(t)}{x+t} = \frac{1}{x+a}. \]
Simple computation gives
\[ L_{k,t}[f(x)] = d_k\,\frac{t^{k-1} a^{k}}{(t+a)^{2k}}, \qquad d_k = (2k-1)!\,c_k. \]
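The "simple computation" for the step-function can be spelled out as follows. This is a sketch of one possible route, assuming the operator of Definition 2.1 in the form $L_{k,x}[f(x)] = (-1)^{k-1} c_k x^{k-1}[x^k f(x)]^{(2k-1)}$; the polynomial $q$ is an auxiliary symbol introduced here.

```latex
% Divide x^k by (x + a): the remainder is the value of x^k at x = -a.
\[
  \frac{x^{k}}{x+a} = q(x) + \frac{(-a)^{k}}{x+a},
  \qquad \deg q = k-1 .
\]
% Since 2k-1 > k-1, the polynomial part disappears after 2k-1 differentiations.
\[
  \left[\frac{x^{k}}{x+a}\right]^{(2k-1)}
  = (-a)^{k}\,\frac{(-1)^{2k-1}(2k-1)!}{(x+a)^{2k}}
  = (-1)^{k-1}\,\frac{(2k-1)!\,a^{k}}{(x+a)^{2k}} .
\]
% Multiplying by (-1)^{k-1} c_k x^{k-1} removes the sign.
\[
  L_{k,x}\!\left[\frac{1}{x+a}\right]
  = (2k-1)!\,c_k\,\frac{x^{k-1}a^{k}}{(x+a)^{2k}}
  = d_k\,\frac{x^{k-1}a^{k}}{(x+a)^{2k}} .
\]
```

Integrating the result over $(0, \infty)$ gives $d_k\,B(k, k) = (k-1)/k$, which tends to 1, in agreement with the total jump of the step-function.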
We now prove 


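THEOREM 2.1. If a > 0, t > 0, then

\[ \lim_{k\to\infty} \int_0^t L_{k,u}\!\left[\frac{1}{x+a}\right] du = \begin{cases} 0 & (0 < t < a), \\ \tfrac{1}{2} & (t = a), \\ 1 & (a < t < \infty). \end{cases} \]

That is, the operator $L_{k,t}[f(x)]$ serves to invert the integral (1.1) at least
in this special case. Set

\[ H_k(t) = d_k \int_0^t \frac{u^{k-1}}{(u+1)^{2k}}\,du. \]

Then

\[ \int_0^t L_{k,u}\!\left[\frac{1}{x+a}\right] du = H_k\!\left(\frac{t}{a}\right), \]

so that we need prove only

\[ \lim_{k\to\infty} H_k(t) = \begin{cases} 0 & (0 < t < 1), \\ \tfrac{1}{2} & (t = 1), \\ 1 & (1 < t < \infty). \end{cases} \]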


If 0<t<1 we have 


t k-1 
(2.1) 0S Hilt) <a& + 




¢ 
lim —_— = < 
k(k 2) (¢ + 1)? (¢ + 1)? 


it follows that the extreme right-hand member of (2.1), and hence also H(t), 
tends to zero with 1/k. 
If 1<t< «, we have by use of the B-function 


= du. 


But 


t k-1 
O<d f ———du<d | 
‘J, ~ 2° 


so that in this case 


lim = 1. 
Finally, if ¢=1, 
k—-1 k—1 1 
H,(1) = & du = - dy du 
1 1)% o (w+ 1)? 
H,(1) 
by an obvious change of variable. Hence 
2 2 


This completes the proof of the theorem. 
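The limiting behavior in Theorem 2.1 can be watched numerically. The sketch below assumes $H_k(t) = d_k \int_0^t u^{k-1}(u+1)^{-2k}\,du$ with $d_k = (2k-1)!/(k!\,(k-2)!)$, as read from §2; the substitution $v = u/(1+u)$ and the use of `math.lgamma` to avoid overflow are choices made here, not the paper's.

```python
import math

def log_d(k):
    """Logarithm of d_k = (2k-1)! / (k! (k-2)!), valid for k >= 2."""
    return math.lgamma(2 * k) - math.lgamma(k + 1) - math.lgamma(k - 1)

def H(k, t, n=4000):
    """Simpson approximation of H_k(t); n must be even.

    With v = u/(1+u) the integrand becomes v^(k-1) (1-v)^(k-1) on the
    finite interval [0, t/(1+t)].
    """
    top = t / (1.0 + t)
    ld = log_d(k)

    def g(v):
        if v <= 0.0 or v >= 1.0:
            return 0.0
        return math.exp(ld + (k - 1) * (math.log(v) + math.log1p(-v)))

    h = top / n
    total = g(0.0) + g(top)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3.0

for k in (5, 50, 200):
    print(k, H(k, 0.5), H(k, 1.0), H(k, 2.0))
```

By the symmetry of the transformed integrand about v = 1/2, the middle column equals (k−1)/(2k) up to quadrature error, while the outer columns drift toward 0 and 1 as k grows.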

3. The inversion of the general Stieltjes integral. Before proceeding to
the general case we need to prove the following simple, but extremely useful
lemma:

LEMMA 3.11. If f(x) has a derivative of order (2k−1), then
\[ x^{k-1}\bigl[x^{k} f(x)\bigr]^{(2k-1)} = \bigl[x^{2k-1} f^{(k-1)}(x)\bigr]^{(k)}. \]
The proof consists merely in computing both sides of the equation by
Leibniz’s rule. In each case we obtain
\[ \sum_{p=0}^{k} \frac{(2k-1)!\,k!}{(2k-p-1)!\,p!\,(k-p)!}\; x^{2k-p-1} f^{(2k-p-1)}(x). \]
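The identity of Lemma 3.11, read as $x^{k-1}[x^k f(x)]^{(2k-1)} = [x^{2k-1} f^{(k-1)}(x)]^{(k)}$, is linear in f, so it can be spot-checked exactly on single powers $x^m$. The sketch below does this with rational arithmetic; a monomial $c\,x^m$ is stored as the pair `(c, m)`, and all function names are hypothetical.

```python
from fractions import Fraction

def deriv(term, order):
    """Differentiate the monomial c*x^m `order` times; term is the pair (c, m)."""
    c, m = term
    for _ in range(order):
        c, m = c * m, m - 1
    # A vanishing coefficient is normalized so that zero terms compare equal.
    return (c, m) if c != 0 else (Fraction(0), 0)

def times_power(term, p):
    """Multiply the monomial by x^p."""
    c, m = term
    return (c, m + p) if c != 0 else term

def side_a(term, k):
    # x^(k-1) * [x^k f(x)]^(2k-1)
    return times_power(deriv(times_power(term, k), 2 * k - 1), k - 1)

def side_b(term, k):
    # [x^(2k-1) f^(k-1)(x)]^(k)
    return deriv(times_power(deriv(term, k - 1), 2 * k - 1), k)

exponents = list(range(-6, 7)) + [Fraction(1, 2), Fraction(-3, 2)]
identity_holds = all(
    side_a((Fraction(1), m), k) == side_b((Fraction(1), m), k)
    for m in exponents
    for k in range(1, 7)
)
print(identity_holds)
```

Agreement on monomials extends by linearity to finite sums of powers; it is a sanity check rather than a replacement for the Leibniz computation.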


Since 
i 




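We shall also need
LEMMA 3.12. If (1.1) converges, then

\[ \lim_{x\to 0+} d_k\, x^{k} \int_0^\infty \frac{t^{k-1}\alpha(t)}{(x+t)^{2k}}\,dt = \frac{k-1}{k}\,\alpha(0+) \qquad (k = 2, 3, \dots). \]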
For, if € is a given positive number, we determine 5(e) such that 
| a(t) — a(0 +)| <e O0StK< de). 
Then 
aust f [a(t) — a(0 +)] a| f 
G3 t) 0 (x + t)?* 0 (x + t)?* 
tk-1 
d —— — a(0 dt |, 
+] dist 20 4)] 
tk-lla(t) — a(0 k—1 
lim sup f dt | e— 
20+ 0 (x + k 


the second term on the right-hand side of (3.1) clearly approaching zero with 
x. Hence 

lim d t)dt = d,a(0 — dt = a(0 
im d,x a(t) +)x a(O +) 


z—0+ 
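In a similar way we can prove
LEMMA 3.13. If $\alpha(\infty)$ exists, then
\[ \lim_{x\to\infty} d_k\, x^{k} \int_0^\infty \frac{t^{k-1}\alpha(t)}{(x+t)^{2k}}\,dt = \frac{k-1}{k}\,\alpha(\infty) \qquad (k = 2, 3, 4, \dots). \]
By use of these results we can now establish
THEOREM 3.1. If the integral (1.1) converges, then
\[ \lim_{k\to\infty} \int_0^t L_{k,u}[f(x)]\,du = \alpha(t) - \alpha(0+) \qquad (t > 0). \]
We begin by writing the integral (1.1) in the form
\[ f(x) = \int_0^\infty \frac{\alpha(t)}{(x+t)^{2}}\,dt, \]
which we are enabled to do by Corollary 1.21. By Lemma 3.11 we have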


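\[ \int_0^t L_{k,u}[f(x)]\,du = (-1)^{k-1} c_k \Bigl[u^{2k-1} f^{(k-1)}(u)\Bigr]^{(k-1)} \Big|_{u=0+}^{u=t}. \]

But simple computation gives

\[ (-1)^{k-1} c_k \Bigl[u^{2k-1} f^{(k-1)}(u)\Bigr]^{(k-1)} = d_k\, u^{k} \int_0^\infty \frac{y^{k-1}\alpha(y)}{(u+y)^{2k}}\,dy. \]

Hence by Lemma 3.12

\[ \int_0^t L_{k,u}[f(x)]\,du = d_k\, t^{k} \int_0^\infty \frac{y^{k-1}\alpha(y)}{(t+y)^{2k}}\,dy - \frac{k-1}{k}\,\alpha(0+). \]

Consequently, it remains only to show that

\[ \lim_{k\to\infty} d_k\, t^{k} \int_0^\infty \frac{y^{k-1}\alpha(y)}{(t+y)^{2k}}\,dy = \alpha(t) \qquad (t > 0). \tag{3.2} \]

Set y/t = v. The integral in question becomes

\[ d_k \int_0^\infty \frac{v^{k-1}\alpha(tv)}{(1+v)^{2k}}\,dv. \]

Further, set

\[ \psi_t(v) = \begin{cases} \alpha(t-) & (0 \le v < 1), \\ \alpha(t) & (v = 1), \\ \alpha(t+) & (1 < v < \infty). \end{cases} \]

Then*
\[ d_k \int_0^\infty \frac{v^{k-1}}{(1+v)^{2k}}\,\psi_t(v)\,dv = [\alpha(t+) + \alpha(t-)]\,H_k(1). \]
Hence by Theorem 2.1

\[ \lim_{k\to\infty} d_k \int_0^\infty \frac{v^{k-1}}{(1+v)^{2k}}\,\psi_t(v)\,dv = \alpha(t). \]

Set
\[ \beta_t(v) = \alpha(tv) - \psi_t(v). \]

This function is continuous at v = 1 and vanishes there. It remains only to
show that

\[ \lim_{k\to\infty} d_k \int_0^\infty \frac{v^{k-1}}{(1+v)^{2k}}\,\beta_t(v)\,dv = 0. \]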
Given e>0, we determine 6(¢) such that 
* For the definition of H;(é) see §2. 


| Biv) | |v —1| 
Then 


< O(1)H.(1 — 6) + + 6) — —8)} 


2 
(3.3) o (1+ 


In obtaining the last term on the right-hand side of (3.3), we have used the 
obvious fact that 


B.(v) = (v— 


Letting k become infinite in (3.3) and making use of Theorem 2.1, we have 


bed 
af (+0) B.(v)dv 


from which our result follows at once. 


lim sup 


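COROLLARY 3.11. If (1.1) converges and if $\alpha(t)$ is continuous in (a, b), then

\[ \lim_{k\to\infty} \int_0^t L_{k,u}[f(x)]\,du = \alpha(t) - \alpha(0+) \]

uniformly in the interval $\alpha \le t \le \beta$, where
\[ a < \alpha < \beta < b \qquad (a > 0), \]
\[ a \le \alpha < \beta < b \qquad (a = 0). \]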


To prove this one has only to show that 


i a —a 
lim J | (y) — a() Jay 


= lim [a(ty) a(t) |dy = 0 


uniformly in (a, 8). If we note that 
a(ty) — a(t) = o(1) (y= 1) 
uniformly in ¢ for ¢ in (a, 8), the proof of this proceeds as for Theorem 3.1. 


COROLLARY 3.12. If $\alpha(\infty)$ exists, then

\[ \lim_{k\to\infty} \int_0^\infty L_{k,u}[f(x)]\,du = \alpha(\infty) - \alpha(0+). \]

COROLLARY 3.13. If
\[ \alpha(t) = O(t^{p}) \qquad (t \to \infty) \]
for some positive value of p, then equation (3.2) holds.


4. The Lebesgue integral. In this section we consider the special case of
(1.1) in which

\[ \alpha(t) = \int_0^t \varphi(u)\,du, \]

the function $\varphi(u)$ being integrable in (0, R) for every positive R and of such
a nature that (1.1) converges. We then have

\[ f(x) = \int_0^\infty \frac{\varphi(t)}{x+t}\,dt. \tag{4.1} \]

We now show that the operator $L_{k,t}[f(x)]$ serves to invert this integral also.

THEOREM 4.1. If t is a point of the Lebesgue set for the function $\varphi(u)$, and
if (4.1) converges, then

\[ \lim_{k\to\infty} L_{k,t}[f(x)] = \varphi(t). \tag{4.2} \]
By the Lebesgue set we mean those points t for which
\[ \int_0^h |\varphi(t+u) - \varphi(t)|\,du = o(|h|) \qquad (h \to 0). \]
Direct computation gives
\[ L_{k,t}[f(x)] = d_k\, t^{k-1} \int_0^\infty \frac{u^{k}}{(t+u)^{2k}}\,\varphi(u)\,du. \]
We have to show that 
u* 
lim f — $(t)]d 


uk 
«mat — $(t)]du = 0. 
lim f — 60) = 0 
Set 


B(y) = "[o(ut) — o(t) 


Since / is a point of the Lebesgue set, we have 
(4.3) A(y) = o(| 1 (y— 1). 




Then the integral in question becomes 


= af (u + 1) dB(u) = 


If we note that 


148 (4 — (u — 
(uw + 19244 + 1)24+ 


and take account of (4.3), we have, by a method similar to that used in obtaining (3.2),


k—2 


IA 


lim sup | Z;_| 


Il 
—) 


lim 


This proves the theorem. 
COROLLARY 4.11. Equation (4.2) holds for points $t$ at which $\varphi(t)$ is continuous.
COROLLARY 4.12. Equation (4.2) holds almost everywhere.
COROLLARY 4.13. If (4.1) converges and if $\varphi(t)$ is continuous in $(a, b)$, equation (4.2) holds uniformly in $\alpha < t < \beta$ where


$a < \alpha < \beta < b.$
COROLLARY 4.14. If $\varphi(t+)$ and $\varphi(t-)$ exist, then


$\lim_{k\to\infty} L_{k,t}[f(x)] = \frac{\varphi(t+) + \varphi(t-)}{2}.$
For, set
$\beta(t, u) = \int_1^t [\varphi(yu) - \varphi(u)]\,dy,$


and note that
$\beta(t, u) = o(|t - 1|) \qquad (t \to 1)$


uniformly in $\alpha < u < \beta$. The proof is now completed by obvious modification of the proof of Theorem 4.1.
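At this point we illustrate Theorem 4.1 by an example. Take


$\varphi(t) = t^{-\delta} \qquad (0 < \delta < 1).$
Then


$f(x) = \frac{\pi}{\sin \pi\delta}\, x^{-\delta}.$
Simple computation gives


$L_{k,t}[f(x)] = \frac{\Gamma(k - \delta + 1)\,\Gamma(k + \delta - 1)}{\Gamma(k + 1)\,\Gamma(k - 1)}\, t^{-\delta}.$
But
$\frac{\Gamma(k + a)}{\Gamma(k)} \sim k^{a} \qquad (a > 0,\ k \to \infty),$


so that we can prove directly that


$\lim_{k\to\infty} L_{k,t}[f(x)] = \varphi(t)$


for all positive $t$.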
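
As a numerical sanity check on the example above, the following Python sketch evaluates the Gamma-function ratio that multiplies $t^{-\delta}$ and lets one watch it approach unity as $k$ grows. It assumes that the example function is $\varphi(t) = t^{-\delta}$ and that the coefficient reads $\Gamma(k-\delta+1)\Gamma(k+\delta-1)/[\Gamma(k+1)\Gamma(k-1)]$; the function name `gamma_ratio` is ours and does not come from the paper.

```python
import math


def gamma_ratio(k: int, delta: float) -> float:
    """Gamma(k-d+1)*Gamma(k+d-1) / (Gamma(k+1)*Gamma(k-1)).

    Assumed reading of the coefficient of t**(-delta) in L_{k,t}[f(x)]
    for phi(t) = t**(-delta), 0 < delta < 1, k >= 2.
    Log-gamma is used so that large k does not overflow.
    """
    if k < 2 or not (0.0 < delta < 1.0):
        raise ValueError("need k >= 2 and 0 < delta < 1")
    log_ratio = (math.lgamma(k - delta + 1) + math.lgamma(k + delta - 1)
                 - math.lgamma(k + 1) - math.lgamma(k - 1))
    return math.exp(log_ratio)


if __name__ == "__main__":
    # Print the coefficient for increasing k at a fixed delta.
    for k in (2, 10, 100, 10_000):
        print(k, gamma_ratio(k, 0.5))
```

Because the logarithm of the Gamma function is convex, the ratio stays below unity for every admissible $k$ and $\delta$, so the approach to $\varphi(t)$ in this example is from below.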
5. The saltus operator. We now introduce a new operator by the


DEFINITION 5.1. The operator $l_{k,t}[f(x)]$ is defined by the equation

$l_{k,t}[f(x)] = 2t \left(\frac{\pi}{k}\right)^{1/2} L_{k,t}[f(x)].$


We first apply the operator to the special function 


_ _ +) - at -) 
fe) =f cto x+1 


where y,(v) was defined in §3. Direct computation gives 


T 1/2 u*® 
liulf(x)] = 2 (=) [a(t +) — a(t —)]. 


Then by use of Stirling's formula, or otherwise, we prove


$\lim_{k\to\infty} l_{k,u}[f(x)] = \begin{cases} 0, & u \ne 1, \\ 1, & u = 1. \end{cases}$
Hence, the limit of 1;,,[f(x)] is the saltus of ¥,(u) at every point u. This 
result is general, as we now prove. 
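
The limiting behavior just described can be checked numerically. The sketch below rests on three assumptions that are reconstructions rather than formulas quoted from the text: that the special function is $f(x) = 1/(x+1)$ (a unit jump at $t = 1$), that $L_{k,u}$ acts on it as $\frac{(2k-1)!}{k!\,(k-2)!}\, u^{k-1}/(u+1)^{2k}$, and that $l_{k,u} = 2u(\pi/k)^{1/2} L_{k,u}$. The name `saltus_value` is illustrative.

```python
import math


def saltus_value(k: int, u: float) -> float:
    """l_{k,u}[1/(x+1)] under the assumed closed form

        2*sqrt(pi/k) * (2k-1)!/(k!(k-2)!) * u**k / (1+u)**(2k),

    valid for k >= 2 and u > 0. Everything is combined in logarithms so
    that the large factorial ratio and the small power do not overflow.
    """
    if k < 2 or u <= 0.0:
        raise ValueError("need k >= 2 and u > 0")
    log_val = (math.log(2.0) + 0.5 * math.log(math.pi / k)
               + math.lgamma(2 * k) - math.lgamma(k + 1) - math.lgamma(k - 1)
               + k * math.log(u) - 2 * k * math.log1p(u))
    return math.exp(log_val)


if __name__ == "__main__":
    # At u = 1 the values approach the jump; elsewhere they decay to zero.
    for k in (10, 100, 1000):
        print(k, [saltus_value(k, u) for u in (0.5, 1.0, 2.0)])
```

Under these assumptions the value at $u = 1$ tends to the jump, unity, while for $u \ne 1$ the factor $[4u/(1+u)^2]^k$ forces geometric decay to zero.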


Lemma 5.11. 




THEOREM 5.1. If (1.1) converges, then
$\lim_{k\to\infty} l_{k,t}[f(x)] = \alpha(t+) - \alpha(t-) \qquad (t > 0).$


If we define B(v) =8,(v) as in §3 and note that 
1/2 bed u® 
le elf(x)] = 2¢* (=) af da(u) 


T 1/2 yk 
2(= a f 
(=) 


the special example treated above shows us that we need only prove 


the function $\beta(v)$ being continuous and equal to zero at $v = 1$. If we integrate by parts, the integral in question becomes


Now note that 


oa 1/2 1+6 d vk T 1/2 ld vk 
k (v-+1)% k o dv (v+1)* 
@d vk dy 1 
1 dv (v + k x+1 


Hence, proceeding as in §3, we obtain 


1 1 
= 2el 
In| < + | 


+40 »2(=) "a im d u* 4 
— — du 


Consequently 


u* 
2(=) as f u|— ——— | du 
k 143 | du (u + 


lim sup | 2e, 

lim I; = 0, 

too 
since 




1 uk 
= (1+ d du = o(1) 


This completes the proof of the theorem. 
The same type of argument enables us to prove the following related result.


THEOREM 5.2. If (1.1) converges and if $\alpha(t)$ has right-hand and left-hand derivatives $\alpha_+'(t)$ and $\alpha_-'(t)$ respectively at a point $t$, then


$\lim_{k\to\infty} L_{k,t}[f(x)] = \frac{\alpha_+'(t) + \alpha_-'(t)}{2}.$


For, if we define 


~ 1)ax,! (0) 


= a(vt) — w(r), 
it is clear that 
(5.1) y(v) = o(| 1 — ) 
and that* 
o (vu + 2 


so that we have now to show that the integral 


approaches zero with 1/k. This may be done by use of (5.1) if we note that 


1+6 vk 


COROLLARY 5.21. If $\alpha(t)$ is constant in $(a, b)$, then


* This may be conveniently proved by breaking the interval into two parts corresponding to the intervals $(0, 1)$ and $(1, \infty)$ and by using Corollary 4.14.



1sv<om, 
1)? 


$\lim_{k\to\infty} L_{k,t}[f(x)] = 0$


uniformly for $t$ in $(\alpha, \beta)$,
$a < \alpha < \beta < b \qquad (a > 0),$
$a \le \alpha < \beta < b \qquad (a = 0).$


Theorem 5.2 becomes of particular interest if $\alpha(t)$ is a function which is not an integral but which has a derivative almost everywhere. The integral (1.1) can not then be put in the form (4.1). Yet the inversion formula has the same effect as if $f(x)$ had the form (4.1) with $\alpha'(t) = \varphi(t)$.

6. Generalizations. We turn now to a group of theorems which may be 
regarded as generalizations of our inversion formulas. 


THEOREM 6.1. If $\alpha(t)$ is of bounded variation in the interval $(0, \infty)$, and if $\psi(t)$ is an arbitrary function continuous* in the interval $0 \le t \le \infty$, then


Since we have already established that

$\lim_{k\to\infty} \int_0^t L_{k,u}[f(x)]\,du = \alpha(t) - \alpha(0+),$


we have only to take the limit under the integral sign on the left-hand side of (6.1) to obtain our result. This will be permissible by the Helly-Bray theorem† if the functions


are of uniformly bounded variation in $(0, \infty)$ for $k = 1, 2, 3, \cdots$. This is so under our hypotheses, since


THEOREM 6.2. If (1.1) converges, and if $\psi(t)$ is continuous in $(0, R)$, then


R R 


* By this we mean that $\psi(t)$ is continuous for every non-negative value of $t$ and that $\psi(t)$ approaches a limit as $t$ becomes infinite.

† See, for example, G. C. Evans, The Logarithmic Potential, Discontinuous Dirichlet and Neumann Problems, American Mathematical Society Colloquium Publications, vol. 6, 1927, p. 15. The infinite
intervals in question may be transformed into finite intervals by the transformation »=e™. 




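Here, we are no longer assuming that $\alpha(t)$ is of bounded variation in $(0, \infty)$. Let $\delta$ be an arbitrary positive constant, and set


$f(x) = \int_0^{R+\delta} \frac{d\alpha(t)}{x+t} + \int_{R+\delta}^{\infty} \frac{d\alpha(t)}{x+t} = f_1(x) + f_2(x).$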


Clearly, Theorem 6.1 is applicable to fi(x). Let Ye(t) be continuous in the 
infinite interval (0<#< ~), coinciding with y/(¢) in (0, R) and constant in 
(R, Then 


R 
= tim fv + tim YR) 
0 R 


R R+6 
= f veda + f — +). 
0 ' R 
Making use of Corollary 3.12, we have 


lim ]dt = lim Y(R) Le,e[fr(x) jdt 
R R 


= ¥(R) [a(t + R) — a(R)]. 
On the other hand 


R 
f V(t)Le,t[ fo(x) |dt = 0, 


as one sees by Corollary 5.21. If we combine these results, our theorem is 
proved. 


THEOREM 6.3. If $\alpha(t)$ has variation $V(R)$ in the interval $0 \le t \le R$, and if (1.1) converges, then


(6.2)  $\lim_{k\to\infty} \int_0^R |L_{k,t}[f(x)]|\,dt = V(R) - V(0+).$


It is sufficient to prove the theorem when $\alpha(0+) = \alpha(0) = V(0+) = 0$.


If we set

$g(x) = \int_0^\infty \frac{dV(t)}{x+t},$


it is clear that


$|L_{k,t}[f(x)]| \le L_{k,t}[g(x)].$


Hence Theorem 3.1 gives

(6.3)  $\limsup_{k\to\infty} \int_0^R |L_{k,t}[f(x)]|\,dt \le V(R).$


On the other hand, from Theorem 6.2 we have for any function $\psi(t)$ continuous in $(0, R)$


$\left| \int_0^R \psi(t)\,d\alpha(t) \right| \le \max_{0 \le t \le R} |\psi(t)| \, \liminf_{k\to\infty} \int_0^R |L_{k,t}[f(x)]|\,dt.$

Hence the norm of the linear functional

$\int_0^R \psi(t)\,d\alpha(t),$

which is known to be $V(R)$, is at most


$\liminf_{k\to\infty} \int_0^R |L_{k,t}[f(x)]|\,dt.$

That is,


(6.4)  $V(R) \le \liminf_{k\to\infty} \int_0^R |L_{k,t}[f(x)]|\,dt.$


Inequalities (6.3) and (6.4) can not both hold unless (6.2) is true.
COROLLARY 6.31. If $V(\infty) < \infty$, then


$\lim_{k\to\infty} \int_0^\infty |L_{k,t}[f(x)]|\,dt = V(\infty) - V(0+).$


7. Differentiation and integration of the inversion operator. Without any restrictions on the integral (1.1) except that it should converge we are able to obtain inversion formulas for the successive integrals of $\alpha(t)$. The following result is seen to be a generalization of Theorem 3.1.


THEOREM 7.1. If (1.1) converges, and if $m$ is any positive integer, then


{™ 


The result follows at once from Theorem 6.1 with 


(¢ — 
m! 


¥(u) = 




The successive derivatives of $\alpha(u)$ or $\varphi(u)$, when they exist, may be obtained in several ways. We first prove


THEOREM 7.2. If $\varphi(t)$ is of class $C^n$ in the interval $0 \le t < \infty$, if


(n)(f 
(7.1) O 
0 at+t 


converges, and if f(x) is defined by (4.1), then 
lim 1)"f(x)] = 
ko 


Since (7.1) converges, we have 


t 
f o™(u)du = o(t) 
0 
Hence 


so that integration by parts gives us 


o0 «tt x x? x3 


(0) f 


(n — +s (x + 


(— 
j=1 x? 
The result now follows by use of Corollary 4.11. 
COROLLARY 7.21. If $\alpha(t)$ is of class $C^n$ in $(0, \infty)$, if the integral


bed (n) 
f a” (#) 
0 x + t 


converges, and if f(x) is defined by (1.1), then 
lim ] = (A) 


A more natural procedure is given in 
THEOREM 7.3. Under the conditions of Theorem 7.2 


n 


d 
lim Lit[f(x)] = 
dt” 




We have at once 
ot” (t+ u)?* ou" (t+ u)?* 
so that 


———— d 
ou” (¢+ u)?* 


Conditions (7.2) now enable us to integrate by parts and obtain 


d” 
|= (= f $(u) 


tk-1 


for values of k sufficiently large. But the asymptotic behavior of this integral 
as k becomes infinite was determined in §4. It clearly approaches the desired 
value. 


u, 


COROLLARY 7.31. Under the conditions of Corollary 7.21


tim Lil f(x)] = a™(2). 


8. A generalized Stieltjes transform. The results of the preceding section suggest a way of inverting the general Stieltjes transform

$F(x) = \int_0^\infty \frac{d\alpha(t)}{(x+t)^{p}},$

where $p$ is any positive number. We first prove

LEMMA 8.11. If $t > 0$, $x > 0$, $p > 0$, then

| | aw =d 


— u)* 
0 L (u + (x + 


F(x) = 


If O<u<i, 
(x — u)e-} + | 


(— 


(x — 


n+p\ (n+k)! (— 
—k+ 1)! 


- o( 


n=k—1 


n 


If x<é integration term by term is permissible, so that 


0 T'(p) L + 




> (" (n+ k)! (—1)"*1 anton! 
(n—k+1)! T(n+p +1) 
+ 2k — 1\ — 
= (2k — 1)! 
k—pyktp—1 
(x + 
It can now be seen by analytic continuation that the formula holds for all positive $x$ and $t$.
LEMMA 8.12. If the integral


n 


= (2k — 1)! 


$\int_0^\infty \frac{d\alpha(t)}{(x+t)^{p}}$

converges, then


(8.1)  $\alpha(t) = o(t^{p}) \qquad (t \to \infty).$


The proof is similar to that of Theorem 1.2. 
By use of these results we now prove 
THEOREM 8.1. If $0 < p < 1$, and if the integral

(8.2)  $F(x) = \int_0^\infty \frac{d\alpha(t)}{(x+t)^{p}}$


converges, then 


(8.3) lim — u) Lu [F(x) ]du = alt. 


Using Lemma 8.12 we have 


0 (x + 


t* 
Li. [F(x)] = a(u)(— | (t+ =a 


F(x) =p 


By uniform convergence we have for 0<6<y/2 
T'(p) 


Lx,1[F(x) |dt 



(p > 0) 
Then 




We may now replace 6 by zero on both sides of this equation. To justify this 
step it is sufficient to show that 


y ad g2k-1 t* 
— mat f | | d 


converges. For this it is sufficient to show that the integrals 


y {2k-p-1 
(8.4) (y | | (p = 0, 1, 


all converge. But 


O(1) (u — 0), 
a(u) = 
O(u?) (u— o), 
so that 
{2k-p-1 
J Gi (t+ 0; p =0,1,---,&). 


This proves the convergence of the integrals (8.4) and hence that 


0 


di. 
f a(2) du 
(y + u)?* 


If we use the known asymptotic expression of this last integral $(k \to \infty)$, our theorem is established.
We illustrate the theorem by an example. Take


$\alpha(t) = 1 \quad (t > 0), \qquad \alpha(0) = 0, \qquad F(x) = \frac{1}{x^{p}}.$

Then

$L_{k,u}[F(x)] = \frac{(k-p)(k-p-1)\cdots(1-p)\,p(p+1)\cdots(p+k-2)}{k!\,(k-2)!}\,\frac{1}{u^{p}},$
‘ _ T+ k +1) 
(t — ]du = DI (¢> 0), 




and the right-hand side clearly approaches unity as k becomes infinite. This 
example shows in particular that the restriction p<1 in Theorem 8.1 was 
essential, for if p>1, the left-hand side of (8.3) need not converge. 

If p is not less than unity we must proceed differently. In order to treat 
this case we introduce a new operator. 


DEFINITION 8.1. An operator Li,.[f(x)| is defined by the equation 


2k— 


In this definition $p$ is any positive number, $k$ is a positive integer greater than $p$. The notation is understood to mean


(k) 


{ f(t) } -f 


if p is not an integer. If p is an integer 
= (= 
It must not be supposed that for fractional p the function {f(#)}-» is 
the fractional derivative of f(t) of order k—p, defined for 0<p<1 by 
dt* J, T(p) 


For, this integral need not exist in the present case. For example, if $f(t) = t^{-1/2}$, the integral does not exist if $p = 1/2$. Yet the operator $L^{1/2}_{k,t}[f(x)]$ clearly exists for all integers $k$ not less than unity.

We prove next 


THEOREM 8.2. If the integral (8.2) converges, then

$\lim_{k\to\infty} \int_0^t L^{p}_{k,u}[F(x)]\,du = \alpha(t) - \alpha(0+) \qquad (t > 0).$


We first prove that for $k$ sufficiently large


(k — 1)! dal?) 
T'(p) 


(8.5) {f(x)}— = 


To see this we have 
I(o+k) dat) 
T(p) Yo 
alt) 


(= 1) F(x) = 
(8.6) 




J F(t)dt 


Tip t+ k+1) ¢* x) 


If 6>0, R>0, «+6<R, the uniform convergence of the integral (8.6) shows 
us that 


— 
4s 


one k 
lin (— 1) 


k 1) a(u) auf (¢ — x)e-! 


T(o + k + 1) au) (¢ — 
0 T() z 


+ 1) au), 
Yo («+ 


(8.7) = (— 


provided the integral (8.7) converges absolutely. This it clearly does for k >p 
by virtue of the relation (8.1). An integration by parts now gives (8.5). Then 
we obtain 


t bed k—-ly —1 
f Liu[F (x) = dit" f +) 


precisely as in the proof of Theorem 3.1. The theorem is now established by 
use of Corollary 3.13. 
In a similar way we may obtain a generalization of Theorem 4.1. 


THEOREM 8.3. If $\varphi(t)$ is integrable in $(0, R)$ for every positive $R$, and if $F(x)$ is defined by the convergent integral


$F(x) = \int_0^\infty \frac{\varphi(t)}{(x+t)^{p}}\,dt \qquad (p > 0),$

then

$\lim_{k\to\infty} L^{p}_{k,t}[F(x)] = \varphi(t)$

at every point $t$ of the Lebesgue set for $\varphi(u)$.


Let us illustrate these theorems by use of the same example as we used for illustration of Theorem 8.1.


F(x) = xP -f (x + -f (x + sett -f (x + dt 


Direct application of Definition 8.1 gives us 


Li. [F(x)] = 0, 


p+1 


Lit [F(x) ] =P; 


p+2 k 1 
[F(x)] = + 1) t. 


k—2 
In each case the appropriate limit process gives the result predicted by Theo- 
rems 8.2 and 8.3. 


9. Uniqueness. Of fundamental importance in later work will be the uniqueness theorem for the operator $L_{k,t}$. As a preliminary result we establish


THEOREM 9.1. Let f(t), g(t) be functions of class C?* in 0<t< @, and let 
(9.1) lim = (¢0,t> p = 1, 2,--- 
(9.2) lim = (t0,t> p = 1,2,--- 


Then 


f = f (4) ]( f(t) dt, 


if either integral exists. 


To verify this one has only to integrate successively by parts. Conditions 
(9.1) and (9.2) guarantee that at each stage the integrated part vanishes. 


THEOREM 9.2. If $f(x)$ is of class $C^{2k-1}$ in the interval $0 < x < \infty$, and if


(9.3) lim (4) ]@-»)(¢ + a)-? = 0 
(t- 0, t> p= i, ,k;a>0), 


(9.4) lim [#2*-1(¢ + a)—*-1] (4-9-2) f(t) = 0 
(t0,t> ~;p = 0, ,» &=— 2), 


= O}) 1), 
f® = Or”) »<k+1), 


then 




This result may be obtained at once from Theorem 9.1. For in that theorem replace $g(t)$ by $(t+a)^{-1}$ and $f(t)$ by


fil) -{ f(ujdu. 


We obtain 


(9.6) f dt = (— f fildt. 


Integrating the right-hand member by parts gives 


af 


1 
=f ——] 


It remains only to verify that the integral (9.7) converges. It clearly does by 
virtue of (9.5). 
We come now to the uniqueness theorem. 


THEOREM 9.3. If $f(x)$ is of class $C^\infty$ in $(0 < x < \infty)$, if (9.3), (9.4) hold for $k = 2, 3, 4, \cdots$, and if (9.5) holds for some positive $\mu$ and $\nu$ which are independent of $k$, then


(9.8)  $f(a) = \lim_{k\to\infty} \int_0^\infty \frac{L_{k,t}[f(x)]}{t+a}\,dt \qquad (a > 0).$


The proof follows at once by use of (3.2). 
Note that if $f_1(x)$ and $f_2(x)$ are two functions satisfying the condition of Theorem 9.3 and such that


$L_{k,t}[f_1(x)] = L_{k,t}[f_2(x)]$
for all sufficiently large integers $k$, then
$f_1(x) = f_2(x) \qquad (0 < x < \infty).$


It is for this reason that the result may be regarded as a uniqueness theorem for the operator $L_{k,t}[f(x)]$.

10. Sufficient conditions for uniqueness. Conditions (9.3), (9.4), and (9.5) are sufficient for the application of Theorem 9.3. In this form, however, it may be difficult to determine, in any given case, whether a function satisfies them or not. It is the purpose of the present section to replace them by conditions more easily applied.


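THEOREM 10.1. If

(10.1)  $f^{(k)}(x) = o\!\left(\frac{1}{x^{k+1}}\right) \quad (x \to 0;\ k = 0, 1, 2, \cdots), \qquad f^{(k)}(x) = o\!\left(\frac{1}{x^{k}}\right) \quad (x \to \infty;\ k = 0, 1, 2, \cdots),$


then equation (9.8) is true.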


For, if one expands (9.3) and (9.4) by Leibniz’s rule, one sees that (9.3), 
(9.4), and (9.5) are true for all positive integers k by virtue of the relations 
(10.1). Note that no positive or negative integral power of x satisfies (10.1). 
For such functions f(x) the right-hand side of (9.8) is zero. 

For use in the proof of our next result we establish 


Lemma 10.21. If k is a positive integer, and if 


soy = 0(=) (10), 


t* 


f O(u)du = O(t**1) (t 0). 


For, integration by parts gives for e>0 


= — —(2k + (4s) du, 


Making use of our hypothesis regarding $f^{(k-1)}(t)$, we obtain


t t 
f = —(2k + yf u2kf(k-1) (4) du 
0 0 


= O(t**) 0). 
LEMMA 10.22. If $f(x)$ is of class $C^\infty$ in the interval $0 < x < \infty$, and if the limits
1 
lim (4) | du 
€ 


exist, then 
Ak! 
x k+1 


(= 


for a suitable constant A. 


then 
) 




The hypothesis for k =1 assures us that the Cauchy value of the integral 
1 
f Yaw 
0 


exists, and hence that there exists a constant A for which 


if(t) ~ A 


We now proceed by induction and assume that 


In particular 


so) = 0(—) =0,1,---,k—1). 


1 
lim (4) du 
e—0+ € 


exists, it follows that . 


Hence 
t 
f [u2*+1f) (4) | du = 
0 


for a suitable constant c;. By successive integrations we have 
t 
f (u)du + P,(t) = O(t*t!) 0), 
0 


where P,(é) is a polynomial of degree at most equal to k. By Lemma 10.21 
P,(t) = O(¢**") (¢— 0). 


But this is impossible unless P;(¢) is identically zero. However, 


+ Pi () = OC), 


(9) = O(t*) 


1 


(t > 0). 
iG) (t > 0). 
Since 
(t— 0), 
(t— 0), 
(¢— 0). 




That is, this last relation must hold for all positive integers k. By use of a 
theorem of Hardy and Littlewood* the proof of the lemma is completed. 
We can now prove 


THEOREM 10.2. If for each positive integer k 


R 
(10.2) = 08) 


then 
Ak! 
(10.3) (— 1)*f(x) ~ (x0; k = 0,1, 2,---), 


(10.4) (x) 0;k=0,1,2,---), 
where 
(10.5) A = lim f(x). 


The conclusions (10.3) and (10.5) follow at once from Lemma 10.22. To 
prove (10.4) we have 


for each positive integer k. This shows that 
= OG") (t— 
from which (10.4) follows at once. 
Our next result is 
THEOREM 10.3. If $f(x)$ satisfies (10.2) for each positive integer $k$, and if


(10.6)  $\lim_{x\to\infty} f(x) = 0,$


then


(10.7)  $f(a) = \lim_{k\to\infty} \int_0^\infty \frac{L_{k,t}[f(x)]}{t+a}\,dt + \frac{A}{a} \qquad (a > 0),$

where
$A = \lim_{x\to 0+} x f(x).$


* See, for example, E. Landau, Darstellung und Begründung einiger neuerer Ergebnisse der Funktionentheorie, Berlin, 1929, p. 58.




A 
g(x) = f(x) —- 
x 


1 
(2) = (x 0;k = 0, 4, 2,--+), 
x 


A+1 


by Theorem 10.2. The same theorem shows that (10.4) is true for g(x). This 
combined with (10.6) implies 


1 
= (x 0;k=0,1,2,---). 
Hence, by Theorem 10.1, 


g(a) = tim f (a> 0). 
Since 

Le.t[f(x)] = 
we see that (10.7) follows at once. 


COROLLARY 10.31. If the functions $L_{k,t}[f(x)]$, $(k = 1, 2, \cdots)$, are all bounded, then (10.6) implies (10.7).


COROLLARY 10.32. If


(10.8) f | Lael f(a)] < 


then (10.7) holds. 
For, if p=1, then 


f | dt < @, 


so that (10.6) must hold. Clearly (10.8) implies (10.2) in this case. If p>1, 
Hélder’s inequality gives 


(10.9) fi | ds f(x) ] joa] 


so that (10.2) is satisfied. For k=1, (10.9) becomes 


For, set 

Then 




= 


Hence (10.6) is satisfied. That is, Theorem 10.3 is applicable. 

11. $\alpha(t)$ of bounded variation. Here we develop a necessary and sufficient condition that the equation (1.1) should have a solution $\alpha(t)$ of bounded variation in the infinite interval $(0, \infty)$.


THEOREM 11.1. A necessary and sufficient condition that $f(x)$ should have the representation (1.1) with $\alpha(t)$ of bounded variation in $(0, \infty)$ is that


where M is some constant independent of k. 


To prove the necessity we have 


t 
La = és da(u), 


deca < 0, 


u*® 
| | dt af t af da(u) | 


= fw fa 
== sf | 


This proves the necessity. 
For the sufficiency, we have by Corollary 10.32 


fo) = tim 


ko 0 t + a a 
A = lim xf(x). 


z—0+ 


By a theorem of Helly* we can pick from the set of functions 


$\alpha_k(t) = \int_0^t L_{k,u}[f(x)]\,du$


* E. Helly, Über lineare Funktionaloperationen, Wiener Sitzungsberichte, vol. 121 (1921), p. 265.



where 

Then 




a subset $\alpha_{k_i}(t)$ which approaches a function $\alpha^*(t)$ of bounded variation in the interval $(0, \infty)$. Then

$f(a) = \lim_{i\to\infty} \int_0^\infty \frac{d\alpha_{k_i}(t)}{t+a} + \frac{A}{a} \qquad (a > 0).$


By the Helly-Bray† theorem we may take the limit under the integral sign and obtain

$f(a) = \int_0^\infty \frac{d\alpha^*(t)}{t+a} + \frac{A}{a} = \int_0^\infty \frac{d\alpha(t)}{t+a} \qquad (a > 0),$


where $\alpha(t)$ vanishes at the origin and differs from $\alpha^*(t)$ by the constant $A$ for positive values of $t$. This completes the proof of the theorem.
12. $\alpha(t)$ non-decreasing. Let us introduce


DEFINITION 12.1. A function $f(x)$ satisfies Property A if and only if
$L_{k,t}[f(x)] \ge 0 \qquad (t > 0;\ k = 0, 1, 2, \cdots).$


Clearly this is equivalent to
$f(x) \ge 0, \qquad (-1)^{k-1}[x^k f(x)]^{(2k-1)} \ge 0 \qquad (k = 1, 2, \cdots),$


or to 
f(z) 20,  (— 1) > = 1,2,---). 


In the proof of our result we shall need 

LEMMA 12.11. If $\varphi(x)$ is of class $C^1$ in $0 < x \le 1$ and if $\varphi'(x)$ is bounded on one side in that interval, then $-\varphi(x)$ is bounded on the same side there.

The proof is obvious. 

THEOREM 12.1. If $f(x)$ has Property A, then the relations (10.3) and (10.5) hold.

Since xf(x) is a positive increasing function, it follows that the constant A 
of (10.5) is well defined. It will be sufficient to prove that 


1 
= of ) (x0; p = 0,1, 2,---). 


Since this has been proved for k=0, we may proceed by induction. Let us 


+ See, for example, G. C. Evans, The Logarithmic Potential, Discontinuous Dirichlet and Neumann 
Problems, American Mathematical Society Colloquium Publications, vol. 6, 1927, p. 15. 




grant then that these relations hold for p=0, 1, 2,- - - , k—2. By hypothesis 
(— 2 0 
By Lemma 12.11 
<M (0<x<1) 
for a suitable constant M. Also 
(— B 0 


from which we deduce in the same way that 


> (0< x <1) 


for a suitable constant $N$. But


Since the second term on the right-hand side is O(1) by our assumption 
(10.6), it follows that 


Nx + O(1) < [x*f(x)]“- < M (0<x< 1). 
Hence 
= O(1) (x 0). 
Expanding by Leibniz’s rule we see that 


= 0(<), 
x 


so that the induction is complete. Hence (10.3) is established. 


THEOREM 12.2. If $f(x)$ has Property A, then there exist constants $A_0, A_1, \cdots$ such that


A 
(12.1) (= 1)'f(x) = — (k 
x 


Since 
(- 1) ] > 0, 
it follows by successive integration that 


By Theorem 12.1 a similar inequality holds in the interval (0<x <1), so that 
(12.1) follows at once. 




THEOREM 12.3. If $f(x)$ has Property A and if $xf(x)$ approaches a limit as $x$ becomes infinite, then


= | lim xf(x) — 


For, if 
B 
f(x)~— 


then Theorem 12.2 with the addition of the Hardy-Littlewood result referred 
to earlier shows us that 


Bk 
(12.2) (— ~ (x— @). 


k+1 


It is easily seen that the relations (12.2) imply that 
lim (— = (k — 1) — 1)!B. 


too 


There is of course a similar result for ‘0 following from the relations (10.3) 
which are necessary consequences of Property A. Hence 


f = af (- 1) #1 (4) ] de 


_ (k= 1)! 
— 2)! 


[B— A]. 


This proves the theorem. It is to be noted that the existence of $B$ added to our hypothesis in this theorem is not a consequence of Property A, as one sees by the examples $f(x) = 1$ and $f(x) = x^{-1/2}$. Both satisfy the property but in each case $B$ fails to exist. We can now treat the case† of bounded non-decreasing functions $\alpha(t)$.


THEOREM 12.4. A necessary and sufficient condition that f(x) should have 
the form (1.1) with a(t) bounded non-decreasing is that f(x) should have the 
Property A and that xf(x) should approach a limit as x becomes infinite. Further, 


(12.3)  $\alpha(\infty) - \alpha(0+) = \lim_{k\to\infty} \int_0^\infty L_{k,t}[f(x)]\,dt.$


The necessity of Property A follows from an inspection of the relations 


† The author treated this case by another method in an earlier paper: D. V. Widder, A classification of generating functions, these Transactions, vol. 39 (1936), p. 244.


LoL f(x) = f(t), 
f(x) ] = aw f 


bed k 
da(u 
(t + u)?* 


Moreover, it is obvious that 


 xda(t) 
Im = hm = 
0 x + t 


a(o), 


To prove the sufficiency we first appeal to Theorem 12.3. This shows that 


<B-A 


where $A$, $B$ are defined as in the proof of Theorem 12.3. Hence (11.1) is satisfied, and, by Theorem 11.1, $f(x)$ has the representation (1.1) with $\alpha(t)$ of bounded variation in $(0, \infty)$. To show that $\alpha(t)$ is non-decreasing we now appeal to Theorem 3.1. Clearly, on the assumption of Property A, the functions


$\int_0^t L_{k,u}[f(x)]\,du$


are non-decreasing functions of $t$.

Finally, (12.3) is a direct result of Corollary 6.31. 
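
Relation (12.3) can be illustrated on the simplest bounded non-decreasing $\alpha(t)$, a unit jump at $t = 1$, for which $f(x) = 1/(x+1)$ and $\alpha(\infty) - \alpha(0+) = 1$. The Python sketch below assumes that for this $f$ the operator takes the form $L_{k,t}[f(x)] = \frac{(2k-1)!}{k!\,(k-2)!}\, t^{k-1}/(t+1)^{2k}$, which is a reconstruction and not a formula quoted from the text. It integrates this expression over $(0, \infty)$ by the substitution $v = t/(1+t)$ and a midpoint rule; `total_mass` is a hypothetical helper name.

```python
import math


def total_mass(k: int, n: int = 20_000) -> float:
    """Midpoint-rule value of the integral over (0, inf) of

        c_k * t**(k-1) / (1+t)**(2k) dt,   c_k = (2k-1)!/(k!(k-2)!),

    for k >= 2. After the substitution v = t/(1+t) the integrand becomes
    c_k * (v*(1-v))**(k-1) on the unit interval.
    """
    if k < 2:
        raise ValueError("need k >= 2")
    log_c = math.lgamma(2 * k) - math.lgamma(k + 1) - math.lgamma(k - 1)
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        v = (i + 0.5) * h
        # Combine the large constant and the small integrand in logarithms.
        total += math.exp(log_c + (k - 1) * math.log(v * (1.0 - v)))
    return total * h


if __name__ == "__main__":
    for k in (2, 5, 50, 500):
        print(k, total_mass(k))
```

Under the stated assumption the integral equals $(k-1)/k$ exactly, by the Beta integral $B(k, k) = [(k-1)!]^2/(2k-1)!$, so the computed values increase toward the total variation 1.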

We turn next to the case of unbounded non-decreasing functions $\alpha(t)$. For the discussion of this case we need


LEMMA 12.51. If $f(x)$ satisfies Property A, then it approaches a limit as $x$ becomes infinite.


For, since 


— [u?f(u)] = 0 (u > 0), 


we have for 0 <y <x by successive integrations 
2 2 2 , (x a y)? 2 ” 
— + — + 2 0, 


whence 
lim sup f(x) = E S 3[y*f(y)]” (y > 0). 


Successive integration of the inequality 


[u2f(u)]” — 2B = 0 




gives 
x*f(x) — y*f(y) — (« — y)[y*f(y)]’ — E(w — y)? 20 (O<y< x). 
Hence 
lim inf f(x) = E, 


or 
(12.4) lim f(x) = E= 0. 


LEMMA 12.52. If $f(x)$ satisfies Property A, then for each non-negative integer $k$ the function $[x^k f(x)]^{(k)}$ is completely monotonic for $x > 0$,


$(-1)^n [x^k f(x)]^{(n+k)} \ge 0 \qquad (n = 0, 1, 2, \cdots).$


By (12.4) and (12.1) it follows from the Hardy-Littlewood result quoted 
earlier that 


(2) = o(=) 0), 


Since . 
(= 0 (= > 0), 
and since 
Jim [x*¥f(x)]M=0 (m= k+1,k+2,---), 


it follows by successive integrations to infinity that 
(12.5) (— 1)"[x*f(x)]@+» > 0 b= 
(12.6) [x*f(x)]} => E20. 


It remains to show that (12.5) holds form=k,k+1,---. 
Let $r$ be a positive integer, and replace $f(x)$ by $x^r f(x)$ in Lemma 3.11. We obtain


Hence by (12.5) and (12.6) 
(— 2 0 (k 
(12.7) (— xrf(x)} > 
But 
xr f(x) } ] = 0 (p= 0,1,2,---,k—- 


lim (— = (k — 1)(k —1)!4 20. 


z—0+ 


2r+1), 
=r+1). 
1;r>0), 




Hence successive integrations of (12.7) from zero give 
(— = 0 (k=r+1), 
and this completes the proof of the lemma. 


LEMMA 12.53. If $f(x)$ has Property A, and if $\delta > 0$, then $f(x+\delta)$ has the same property.


For, 


(- f(x + 6) | 
( (— (a + 6) f(x + 8) 


(12.8) 


By Lemma 12.52 
(— + + 20 
so that every term in the sum (12.8) is non-negative.
LEMMA 12.54. If $f(x)$ has Property A and if $\delta > 0$, then

$F(x) = \frac{f(\delta) - f(x+\delta)}{x}$


has the property and
(12.9)  $\lim_{x\to\infty} xF(x) = f(\delta) - f(\infty).$


For 
(— = (— 1)*[x + (k= 1,2,---). 
But the right-hand side is non-negative by Lemmas 12.52 and 12.53. Also 


= IO) _ 


-f@)20 


so that F(x) has Property A. By Lemma 12.51 we deduce (12.9). 
By use of this last result we can now prove the main result of the section. 


THEOREM 12.5. Property A for the function $f(x)$ is necessary and sufficient that it should have the form


(12.10)  $f(x) = E + \int_0^\infty \frac{d\alpha(t)}{x+t},$


where $\alpha(t)$ is non-decreasing and $E \ge 0$.


The proof of the necessity is made as in Theorem 12.4. 




For the sufficiency we have at once by Lemma 12.54 and Theorem 12.4 
for a given positive 6 


F(x) 


_ +8) — f@) 
= = —— 

— x o 
where $\beta(t)$ is non-decreasing and bounded. In fact

$\beta(0) = 0, \qquad \beta(\infty) = f(\delta) - f(\infty).$

But $\beta(t)$ must be constantly zero in $(0, \delta)$, for otherwise it would have a point of increase there and by Theorem 1.7, $f(x)$ would have a singularity for some positive $x$. But, by Lemma 12.52, $f(x)$ is completely monotonic and hence analytic for $x > 0$. Hence


f(x + 8) = f(s) -f = +f -f 
0 0 0 


t+6 
= + f aa + +8), 


f(x) = =) + f +8) 
da(t) 
c+t 


(12.11) = io) + f 


where 


a(t) = f (u + 5)dB(u + 4). 


Clearly $\alpha(t)$ is non-decreasing. It is independent of $\delta$ by Theorem 1.6. Hence (12.11) holds for all $x > 0$, and our theorem is proved.

13. $\varphi(t)$ of class $L^p$, $p > 1$. In this section we deduce conditions on $f(x)$ which will insure its representation in the form (4.1) with


(13.1)  $\int_0^\infty |\varphi(t)|^p\,dt < \infty$


for some constant $p > 1$. The result is


THEOREM 13.1. A necessary and sufficient condition that $f(x)$ should have the form (4.1) with $\varphi(t)$ satisfying (13.1) is that


(13.2) rae < at 


(13.3) lim xf(x) = 0, 




where M is some constant independent of k. 


For the necessity we have from §4 


These integrals all converge absolutely since 


Vert (gk + (gk — 
J G+ | | < | | aw | | (2gh) ] 


1 1 
4 


by Hélder’s inequality. We also have for k>1 


pk—-lyk pla 
| | S det J Ga [a 
uk 


=== fo rau = | 


For $k = 1$, this argument fails. However, in this case we can obtain our result by use of Hilbert's double-integral theorem.†




up(u) 
du 
o (¢+ 


f | Li [f(x)] ef | | 


T(2) 


Li lf(x)] 


(13.5) 


f= 
Clearly (13.3) also holds, since 


t 
lim xf(x) = lim f o(u)du. 
0+ Jo 


z—-0+ 


† See G. H. Hardy, J. E. Littlewood, G. Pólya, Inequalities, Cambridge, 1934, p. 229, Theorem 319.



0 0 o (¢ + u)** 
where 




Hence the necessity of (13.2) is established. 
Conversely, we see that (13.2) implies, by Corollary 10.32, that 


Lil f(«)] 


A 
f(a) = lim (a> 0), 


k- 


A = lim <xf(x). 


Furthermore, (13.2) implies† the existence of a subset $k_i$ of all the integers $k$ and a function $\varphi(t)$ of class $L^p$ in $(0, \infty)$ such that


Li. 
io 0 t+a 0 t+a 


Hence


(13.6)  $f(x) = \frac{A}{x} + \int_0^\infty \frac{\varphi(t)}{x+t}\,dt \qquad (x > 0).$

But $A$ is zero by virtue of (13.3), so that the theorem is established.


COROLLARY 13.11. Conditions (13.2) are necessary and sufficient that $f(x)$ should have the representation (13.6) with $\varphi(t)$ satisfying (13.1).


COROLLARY 13.12. If $f(x)$ has the representation (4.1), (13.1), then


For, Fatou’s lemma gives 
f | < lim inf f | Le 
0 


This combined with (13.4) gives the result. 
14. Continuation, $p = 1$. That Theorem 13.1 can not hold for $p = 1$ one sees from Theorem 11.1. For this case we prove

THEOREM 14.1. A necessary and sufficient condition that $f(x)$ should have the form (4.1) with $\varphi(t)$ of class $L$ in $(0, \infty)$ is that the functions $L_{k,t}[f(x)]$, $(k = 1, 2, \cdots)$, should all be of class $L$ and that


(14.1)  $\lim_{k,l\to\infty} \int_0^\infty |L_{k,t}[f(x)] - L_{l,t}[f(x)]|\,dt = 0,$


(14.2)  $\lim_{x\to 0+} x f(x) = 0.$


† See S. Banach, Opérations Linéaires, p. 130. The proof there given is easily extended to the case of an infinite interval.


z—0+ 




If f(x) has the form (4.1) with ¢(é) of class L, then 


-f | du | | du (k 


so that the first part of our condition is necessary. For the second part we 
have 


k-1 


u*® 
| o(u) — g(t) | du 


| - a f G+ 


= af o(iu) — | du. 


k 


| Li[f(x)] at s a. wae 


gu) | 6 | ae 


But $g(1) = 0$, $g(u)$ is continuous at $u = 1$, and a constant $M$ exists such that
| g(u)| << + M (0<u<o). 


Under these conditions 


u* 
lim af = g(1) = 0 


by Corollary 4.11. From this (14.1) is immediate. 
Conversely, the assumption (14.1) implies the existence of a function $\varphi(t)$ of class $L$ such that


(14.3) lim | Lee[f(x)] — o@) | dt = 0, 


(14.4) lim | LeeLf(x)] | de -f | p(t) | de. 
0 0 


Equation (14.4) combined with (14.1) implies (11.1) for a suitable constant 
M. Hence by Theorem 11.1 


Hence 
where 
ko 0 




dalt 
a) = ff 


where $\alpha(t)$ is of bounded variation in $(0, \infty)$. But


<f | — o( | dt 
Hence by (14.3) 


lim “Le Jat = f 


But by Theorem 3.1 


lim “Leal ]dt = a(u) — a(0 +) 


0 


By (14.2) 
lim «f(x) = a(0 +) = 0, 


z—0+ 


= (u = 0). 


This completes the proof of the theorem. 
COROLLARY 14.11. If $f(x)$ has the form (4.1) with $\varphi(t)$ of class $L$, then


lim Jat =f sou. 


15. $\varphi(t)$ bounded. To conjecture a condition for this case one would naturally allow $p$ to become infinite in (13.2). This would lead to


(15.1) | < 
But note that for k =1 we have (13.5) and that 


1 1 
lim r = lim r(2 -)r(-) = 00, 


In fact (15.1) is not necessary for the boundedness of $\varphi(t)$. For, let $\varphi(t)$ be equal to unity in $(0, 1)$ and zero elsewhere. Then


1 1 
Li1[f(x)] = log (1 -) (t > 0), 



so that . 




and this function becomes infinite as $t$ approaches zero. We may overcome this difficulty by replacing (15.1) $(k = 1)$ by a condition on $f(x)$ of a slightly different type. The result is stated in


THEOREM 15.1. A necessary and sufficient condition that $f(x)$ can be represented in the form (4.1) with $\varphi(t)$ bounded is that


(15.2)  $|L_{k,t}[f(x)]| < M \qquad (t > 0;\ k = 2, 3, \cdots),$
(15.3)  $\lim_{x\to 0+} x f(x) = 0,$


(15.4)  $\lim_{x\to\infty} f(x) = 0$


for a suitable constant $M$.
If $f(x)$ has the form (4.1), and if
$|\varphi(t)| < M \qquad (0 < t < \infty),$
then


so that (15.2) is satisfied. Also 


aad t u 
lim dt = lim ¢(t)dt = 0, 


o «+t u—0+ 0 
= 0, 


so that (15.3) and (15.4) also hold. 

Conversely, (15.2) implies (10.2) at least for $k = 2, 3, \cdots$. Also (15.3) and (15.4) imply (10.2) for $k = 1$. Hence (10.3) and (10.4) hold. But these combined with (15.3) and (15.4) give


= o(—) (x30; k = 0,1,2,--° 


1 
= (sx =0, 1,2,--- 
x 


Hence we obtain by successive integration by parts the identity 


cx(— 1)* dt = — dyx* f 


By (3.2) 


— = lim f Peal fo) dt. 
ko 0 (x + t)? 
Furthermore (15.2) implies the existence† of a subset k_i of the integers k and
a bounded function φ(t) such that
t 
0 


Now let 0<x<y. Since the integral 


f 
o (x + 


is uniformly convergent in any closed interval of the positive x-axis, we 
have 


(x + + 24) 


Hence, for any fixed positive x, (15.4) gives 
t 
(15.5) f (y—» @). 
0 


40) =o - 9) f 


(x + + 4) y 
Moreover, since φ(t) is bounded, there exists a constant N such that


t N 
(0<t<o). 

t 
Hence we are in a position to apply a Tauberian theorem of Hardy and Littlewood‡
to the integral (15.5) considered as a function of y. The conclusion is


that 


which is what we set out to prove. 
COROLLARY 15.1. If f(x) has the form (4.1) with φ(t) bounded, then§

\lim_{k\to\infty} \operatorname*{l.u.b.}_{0<t<\infty} |L_{k,t}[f(x)]| = \operatorname*{true\ max}_{0<t<\infty} |\varphi(t)|.

† See, for example, S. Banach, loc. cit., p. 130.

‡ G. H. Hardy and J. E. Littlewood, Notes on the theory of series (XI): on Tauberian theorems,
Proceedings of the London Mathematical Society, vol. 30 (1930), p. 33.

§ For definition of true max see, for example, S. Banach, loc. cit., p. 227.


= 4(1) 
f(x) = J a, 




16. A more general case. We next investigate what functions f(x) can be
represented by a convergent integral of the form

f(x) = \int_0^\infty \frac{d\alpha(t)}{x+t},

with no restriction on α(t) except that it should be of bounded variation in
every finite interval, and bounded in the infinite interval. To treat this case
we need certain preliminary results which we now establish. We introduce a
new operator M_{k,t}[f(x)] by the


DEFINITION. 
Mi l[f(x)] = (= 1) tea 
M,,.[f(x)] = 

Our first result is contained in 

THEOREM 16.1. If


(16.1) {WO = o(—) (t-0;” = 


1 
(16.2) = o(—) (t> o;n= 
then 


f(x) = lim dt (x > 0). 


f(x) ] 


(x + #)? 


By an application of Theorem 9.1 or directly by integration by parts one 


shows that 


Equation (3.2) now gives the result desired. 
We shall next show that conditions (16.1), (16.2) follow automatically
from the boundedness of M_{k,t}[f(x)].


THEOREM 16.2. If

(16.3)    M_{k,t}[f(x)] = O(1)    (t → ∞, t → 0; k = 1, 2, ...),

then (16.1) and (16.2) are true.
For, one sees easily that (16.3) implies 


0, 1,2,---), 
f 

6,1, 2,---), 




= (t> ©, k = 0, 


of which (16.2) is a trivial consequence. Furthermore (16.3) implies 

(16.4) = O(1) (¢— 0), 
(16.5) 4-2) = O(1) (t— 0). 
If we now assume (16.1) for n = 0, 1, 2, ..., 2k−4, we see that it also holds
for n = 2k−3 by (16.5) and for n = 2k−2 by (16.4). Since it obviously holds
for n = 0 by (16.4), k = 1, it must hold in general.
By use of these results one may now prove 


THEOREM 16.3. A necessary and sufficient condition that

(16.6)    f(x) = \int_0^\infty \frac{\varphi(t)}{(x+t)^2}\,dt,

where φ(t) is bounded, is that there should exist a constant N for which

(16.7)    |M_{k,t}[f(x)]| < N    (k = 1, 2, 3, ...; 0 < t < ∞).


Note the contrast of this result with Theorem 15.1 by reason of the ab- 
sence of any conditions of the type (15.3), (15.4). The proof follows by use 
of Theorems 16.1 and 16.2, and is omitted. 

For the applications to follow it is desirable that the conditions of Theorem
16.3 involving the operator M_{k,t}[f(x)] should be replaced by one involving
L_{k,t}[f(x)]. We thus prove


THEOREM 16.4. A necessary and sufficient condition that f(x) should have
the representation (16.6) with φ(t) bounded and satisfying

(16.8)    \int_0^t \varphi(u)\,du \sim At    (t \to 0)

for some constant A is that


for some constant N. 


If f(x) has the representation described, then it follows from an abelian
theorem, easily proved, that


(— 1)*k!A 
(16.10) 
x 


(x — 0). 


k+1 


(16.9) | (b= 1,2,3,---) 
— 




But (16.10) shows that 


k—1 
A 


(16.11) k 


(16.12) = M,,[f(x)] — A. 


By Theorem 16.3 the right-hand sides of these equations have upper and 
lower bounds independent of k, so that the necessity of (16.9) is established. 

Conversely, the existence of the integrals (16.9) implies by Theorem 10.2 
the existence of a constant A such that 

Hence (16.11) and (16.12) are again true and (16.9) implies (16.7). That is,
(16.1) holds with φ(t) bounded. It remains only to establish (16.8). But this
follows from (16.10), k=0, and the boundedness of φ(t) by a known Tauberian
theorem.†

By use of these preliminary results we now prove 


THEOREM 16.5. A necessary and sufficient condition that

f(x) = \int_0^\infty \frac{d\alpha(t)}{x+t},

where α(t) is a normalized function of bounded variation in every finite interval
and is bounded in the infinite interval, is that there should exist a constant M
and a positive function N(t) such that

(16.13)    \left| \int_0^R L_{k,t}[f(x)]\,dt \right| < M    (R > 0; k = 1, 2, 3, ...),

(16.14)    \int_0^R |L_{k,t}[f(x)]|\,dt < N(R)    (R > 0; k = 1, 2, 3, ...).


If f(x) has the representation described, then

(16.15)    f(x) = \int_0^\infty \frac{\alpha(t)}{(x+t)^2}\,dt,

\int_0^t \alpha(u)\,du \sim \alpha(0+)\,t    (t \to 0),


† G. H. Hardy and J. E. Littlewood, On Tauberian theorems, Proceedings of the London
Mathematical Society, vol. 30 (1930), p. 33, Theorem 5.


so that (16.3) follows from Theorem 16.4. Also

\lim_{k\to\infty} \int_0^R |L_{k,t}[f(x)]|\,dt = V(R) - V(0+)

from Theorem 6.3, so that the existence of N(R) is insured.
Conversely, (16.13) implies that f(x) has the form (16.15) with α(t)
bounded and

\int_0^t \alpha(u)\,du \sim At    (t \to 0).


Set 


k— 1 
A. 


-f Liul f(x) ]du = (- 
But 


and we showed in §3 that this integral approaches α(t) except perhaps in a
set E of measure zero. But the variation of α_k(t) in (0, R) is clearly not greater
than N(R) by (16.14). Hence α(t) is a normalized function of bounded variation
in (0, R) if suitably redefined. This redefinition has no effect on f(x)
since E is of measure zero. But

\int_0^\infty \frac{\alpha(t)}{(x+t)^2}\,dt = -\frac{\alpha(t)}{x+t}\Big|_0^\infty + \int_0^\infty \frac{d\alpha(t)}{x+t}.

The first term on the right-hand side is zero since α(t) is bounded and α(0) = 0.
Hence the theorem is completely established.

It might be supposed that we could remove the restriction of boundedness
of α(t) by considering the function

F(x) = \frac{f(x) - f(\delta)}{\delta - x}



as was done in §12. For even if α(t) becomes infinite, F(x) satisfies (16.13) and
(16.14) when the integral

(16.16)    f(x) = \int_0^\infty \frac{d\alpha(t)}{x+t}



converges. The converse is not true. If F(x) satisfies (16.13) and (16.14) we
have indeed

\frac{f(x) - f(\delta)}{\delta - x} = \int_0^\infty \frac{\beta_\delta(t)}{(x+t)^2}\,dt,

where β_δ(t) is bounded. If f(x) had the representation (16.16), then

\beta_\delta(t) = \int_0^t \frac{d\alpha(u)}{\delta + u}.

But if α(t) = t sin t, then β_δ(t) is bounded. For

\int_0^\infty \frac{u \sin u}{(\delta + u)^2}\,du

converges. But by Theorem 1.2 the integral (16.16) can not converge unless
α(t) = o(t), which is not the case in the example considered.

17. The Paley-Wiener inversion operator. We conclude by showing the
relation between the operator L_{k,t}[f(x)] and the inversion operator given by
Paley and Wiener.† They showed that if


f(x) 


where φ(t) is a function of class L² in the interval (0, ∞), then


1 m(—1)"/ 


ml? (2n)! dt 


We may abbreviate this precise result by the symbolic equation

\varphi(t) = \frac{1}{\pi t^{1/2}} (\cos \pi D)\bigl(t^{1/2} f(t)\bigr),

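
The symbolic equation can be tried out on a power function. The sketch below is only a formal check: it takes D = t d/dt, as in the text, so that D multiplies t^b by b and cos πD multiplies it by cos πb, and it uses φ(t) = t^a with −1 < a < 0, which is not of class L² on (0, ∞). The function names are illustrative.

```python
import math

def stieltjes_of_power(a, x, half_width=120.0, step=0.01):
    """Trapezoid approximation of integral_0^inf t^a/(x+t) dt for -1 < a < 0.

    The substitution t = x*exp(v) turns the integral into
    x^a * integral_{-inf}^{inf} exp((a+1)v)/(1+exp(v)) dv,
    whose integrand decays exponentially in both directions.
    """
    n = int(round(2.0 * half_width / step))
    total = 0.0
    for i in range(n + 1):
        v = -half_width + i * step
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.exp((a + 1.0) * v) / (1.0 + math.exp(v))
    return (x ** a) * total * step

def stieltjes_closed_form(a, x):
    """Classical value pi * x^a / sin(pi*(a+1)) of the same integral."""
    return math.pi * x ** a / math.sin(math.pi * (a + 1.0))

def recovered_coefficient(a):
    """Coefficient c in (1/(pi t^(1/2))) (cos pi D)(t^(1/2) f(t)) = c * t^a.

    Here f(t) = C t^a with C = pi/sin(pi*(a+1)); t^(1/2) f(t) = C t^(a+1/2)
    is an eigenfunction of D with eigenvalue a + 1/2.
    """
    big_c = math.pi / math.sin(math.pi * (a + 1.0))
    return big_c * math.cos(math.pi * (a + 0.5)) / math.pi

if __name__ == "__main__":
    a, x = -0.3, 2.0
    print(stieltjes_of_power(a, x), stieltjes_closed_form(a, x))
    print(recovered_coefficient(a))  # formally recovers phi(t) = t^a
```

The coefficient equals 1 because cos π(a + 1/2) = sin π(a + 1), so the operator formally returns the function that was transformed.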


On the other hand 


1 k-2 
(17.1) (1 


n+1 : 


where 


† For reference see Introduction.


where

D = t\,\frac{d}{dt}.




i3 3 5 2k—5 2kR—3\ 2k —3 2k—-1 
2\2 2 4 4 6 2k—4 2k—4/2k—2 2k 
To prove this one has only to verify that the differential operators on
opposite sides of equation (17.1) have the same system of fundamental
solutions,


x 


and to compare the coefficients of f^{(2k-1)}(t) in the expanded forms of both
operators. This coefficient for the left-hand side of (17.1) is


while for the right-hand side it is 


k-2 
Sk Il ( 2 Jom, 
T 2n 1 


Equating these coefficients gives (17.2). 


Since 
2z 
= li 1— 
TI ( 2n+ 
and 
2 f2n—12n+1 
2 
Tv n=1 2n 2n 
it follows that 
lim 
and that 
1 
lim Li,.[f(x)] = (cos 


so that the two operators are symbolically equivalent. 


HARVARD UNIVERSITY, 
CAMBRIDGE, Mass. 




ANALYTICAL GROUPS* 


BY 
GARRETT BIRKHOFF 


INTRODUCTION 


1. Abstract groups. The present paper will deal with abstract continuous 
groups. This means that it will discuss symbols which behave like transforma- 
tions, without specifying the domain on which the transformations operate. 
The reader will be assumed to be conversant with abstract groups as algebraic 
entities. 

2. Questions in the large. It is well-known that the theory of continuous 
groups in the large differs essentially from the theory in the small. Some 
things, such as the one-one correspondence between closed subgroups of a 
Lie group and subalgebras of the Lie algebra of its infinitesimal generators, 
are true only locally;† others, such as the introduction by Weyl and Haar of
invariant mass, are possible only when one deals with groups in the large. 

The present paper is a theory in the small exclusively; it neither involves 
implicitly nor resolves explicitly the difficulties in the large. In this it re- 
sembles the original theory of Lie. 

3. Actual contents. Thus the paper avoids two large classes of questions. 
What questions does it answer—what are its assumptions, and how can one 
summarize its conclusions? 

The paper deals with systems (called “analytical groups”) in which an 
associative multiplication is defined, and which can be so mapped on a 
Banach parameter-space that if one multiplies all elements by any fixed 
element near the origin, vector differences are left nearly invariant. 

It is proved that if G is any analytical group (more properly, analytical 
group nucleus), then 

(1) G is a topological group nucleus in the usual sense. 

(2) One can introduce canonical parameters into G. 

(3) G has an infinitesimal (Lie) algebra L(G). 

(4) The analytical subgroups of G correspond biuniquely to the closed 

* Presented to the Society, December 31, 1936; received by the editors December 17, 1936 and, 
in revised form, May 3, 1937. 

† Again, the group of topological automorphisms of the group of the torus differs radically
from that of the group of translations of the plane, in spite of the fact that these two groups are
locally isomorphic.

‡ A Banach space is of course simply a system having certain prescribed elementary properties
of euclidean space which are shared by various important function spaces. Cf. §8.


62 GARRETT BIRKHOFF |January 


subalgebras of L(G), a subgroup being normal if and only if the corresponding
subalgebra is invariant.

(5) If G is under canonical parameters, then there exists a formal series
of polynomials determined by L(G) which expresses the rule for forming
group products.

(6) One can define product integrals for functions with values in L(G), 
which include the Lebesgue integral (the case G is the additive group of real 
numbers), and all known product integrals (the case G is a group of matrices). 

(7) Quite general functions x(λ) with real arguments and values in a
Banach space B determine formal series in elements of B and their brackets,
which express the product integral ∫x(λ)dλ under canonical parameters for
any analytical group G whose parameter-space contains B and whose
“commutation modulus” is sufficiently small.

(8) All the operations defined (e.g., vector addition under canonical pa- 
rameters, product integration) are topologico-algebraic—preserved under 
topological isomorphisms. 

4. Extension to infinite dimensions. Perhaps the main advantage of the
above assumptions is the fact that many infinite continuous groups satisfy
them. This marks a real advance in the analytical theory of groups.*

The infinite-dimensional analytical groups treated in the literature are of
two kinds: the infinite continuous groups of analytical transformations

x_i′ = f_i(x_1, ..., x_n)    (i = 1, ..., n)

of n-dimensional space discussed by Lie [10]† and Cartan [4], and the groups
of linear operators on Hilbert space recently studied by Delsarte [6]. Each
of these authors omits to define the meaning of the convergence T_n → T of a
sequence of transformations T_n to a limit T, in other words, to define the
topological structure of the corresponding abstract groups.

This omission, and the omission to establish a rigorous correlation be- 
tween the actual transformations of such groups and so-called “infinitesimal” 
transformations, are not trivial. In fact, although the present paper supplies 
a complete theory for a class of groups including those studied by Delsarte, 
the author does not even know what the facts are in the case of the groups 
studied by Lie and Cartan. Part of the difficulty is that the group manifolds 
are not metric-linear; part of it is that canonical parameters do not define 
even locally a one-one representation of the group manifold. 


* Cf. Abstract 41-3-129 of the Bulletin of the American Mathematical Society (1935); also 
Continuous groups and linear spaces, Matematicheskii Sbornik, vol. 1 (1936), pp. 635-642; an address 
delivered at the First International Topological Conference, Moscow, September 5, 1935. 

† Numbers in brackets refer to the bibliography at the end of the paper.


1938] ANALYTICAL GROUPS 63 


5. Continuous groups: topological and analytical. This illustrates the im- 
portance of geometrical properties of group manifolds; we shall now see how 
continuous groups can be classified on a purely geometrical basis. 

A “continuous” abstract algebra (whether group, ring, or field) is a sys- 
tem whose elements are simultaneously the points of a geometrical manifold 
and the symbols of a formal calculus, and whose algebraic operations deter- 
mine “smooth” functions of the manifold into itself. By letting the geometry 
of the manifold suggest the proper definition of smoothness, one is led to a 
purely geometrical classification of continuous abstract algebras. 

Thus with groups whose manifolds are general topological spaces, one natu- 
rally regards “smoothness” of the group operations as meaning that group 
multiplication and passage to the inverse are continuous in the topology of 
the group manifold. Such groups are called topological. 

Similarly, with groups whose manifolds are n-dimensional analytical va- 
rieties, it is natural to assume that the group operations are analytical in the 
coordinates; this leads to the usual concept of a Lie group. 

Now it is a remarkable fact, that two analytical systems which are con- 
tinuous images of each other, are in general analytical images of each other. 
This seems to hold even in pure geometry: thus dimensionality, originally 
known to be invariant only under analytical transformations, is now realized 
to be a topological invariant. We shall extend the domain of validity of this 
principle below, by proving that continuous isomorphisms between Lie groups 
are necessarily analytical.* 

6. Groups as topological algebras. The result just stated, combined with 
(8), suggests that one can develop a theory for analytical groups in which 
group multiplication and passage to the limit are the only notions introduced 
as undefined primitives. 

Indeed, this program is technically feasible: it is shown below that one can 
give topologico-algebraic definitions of analytical groups. But as the argu- 
ment is really metric, it would be misleading to make it pseudo-topological— 
even though it is less analytical and more topologico-algebraic‡ than any


* Discontinuous (and hence non-analytical) isomorphisms exist; there is one between the group 
of translations of the line and the group of translations of the plane. To see this, form in each an 
independent basis with respect to linear combination with rational coefficients. (However, van der 
Waerden, Mathematische Zeitschrift, vol. 36 (1933), pp. 780-787, has shown that isomorphisms 
between compact semi-simple Lie groups are always analytical.) Conceivably two Lie groups which 
are isomorphic and have homeomorphic manifolds are eo ipso analytically isomorphic. 

† Especially since O. Schreier [15] has obtained so much information about group manifolds
by such a theory.

‡ Thus pure group algebra, especially that of commutation, is shown to yield many results
(especially in Chaps. IV-V) which could not be obtained by general analytical methods.


previous reasoning yielding the same results. 

All this relates to the well-known problem of determining the weakest 
analytical assumptions demonstrably equivalent to an assumption of unre- 
stricted analyticity. The weakest assumption in the literature* (cf. [11]) is 
that the function of group multiplication has continuous second derivatives. 
It is shown below that if one assumes continuous first derivatives, then one
can deduce the whole theory† of abstract Lie groups.


CHAPTER I. TECHNICAL MACHINERY 


7. A remark on notation. It will shorten the argument in the sequel to use
the following notational conventions: M(λ) for any positive function of a real
variable λ such that lim_{λ→0} M(λ) = 0; O(λ) for any such function satisfying
O(λ) ≤ K·|λ| for some K < +∞; o(λ) for any such function satisfying
o(λ) ≤ |λ|·M(λ) for some M(λ). (The relation of the last two definitions to
Landau's well-known o-O notation‡ is obvious.)


Thus let φ(x_1, ..., x_r) and ψ(x_1, ..., x_r) be any two real-valued functions
of the same (not necessarily numerical!) variables x_1, ..., x_r. By the preceding
definition,

φ(x_1, ..., x_r) ≤ M(ψ(x_1, ..., x_r))

means that given η > 0, δ > 0 exists so small that ψ(x_1, ..., x_r) < δ implies
φ(x_1, ..., x_r) < η. The inequalities φ(x_1, ..., x_r) ≤ O(ψ(x_1, ..., x_r)) and
φ(x_1, ..., x_r) ≤ o(ψ(x_1, ..., x_r)) have similar meanings.


It is obvious that in terms of this notation, the following substitutions
are legitimate:

(7α) O(λ) for o(λ), and M(λ) for O(λ).
(7β) M(λ) for M(O(λ)), and O(λ) for O(O(λ)).
(7γ) M(λ + μ) for M(λ) + M(μ).
(7δ) M(λ) for M(λ)/[1 − M(λ)].


Thus if ¢(m,---, x,)), then by 

It goes without saying that the M-functions, o-functions and O-functions
appearing in the text vary from group to group, and from inequality to
inequality, although since only a finite number of such functions are used in
dealing with any one group, there exists a single M(λ), o(λ), and O(λ) which
works in all inequalities for that group.

* Except when dealing with compact (von Neumann) and abelian (Pontrjagin) groups, where
one need only assume that one has a topological group locally homeomorphic with euclidean space.

† The author announced this result in Abstract 41-5-192 (1935) of the Bulletin of the American
Mathematical Society.

‡ Cf. G. H. Hardy, Pure Mathematics, 5th edition, Cambridge University Press, 1928, p. 448.

8. Formal definition of “analytical group.” The properties of “analytical
groups” which will be assumed were indicated in §2; they can be stated
explicitly as

DEFINITION 1. By an analytical group will be meant any region about the
origin of a Banach space, in which an associative multiplication is defined
for elements near the origin θ, satisfying

(1)     xθ = θx = x for all x.
(2′)    |(xa − xb) − (a − b)| ≤ M(|x| + |a| + |b|)·|a − b|.
(2″)    |(ay − by) − (a − b)| ≤ M(|y| + |a| + |b|)·|a − b|.

In words, the origin is the group-identity e, and vector differences are nearly
invariant under group translations T_x^y : a → xay. (By xy or x∘y is meant the
group product of x and y.)
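
A concrete instance may help fix ideas. The sketch below is an illustration chosen here, not an example from the paper: it takes 2×2 real matrices as the parameter space, with each group element g near the identity I represented by x = g − I, so that the group product becomes x∘y = x + y + xy and the origin is the identity. The Frobenius norm is used as the absolute value, and all function names are hypothetical. In this case (xa − xb) − (a − b) equals the matrix product x(a − b), so (2′) holds with M(λ) = λ.

```python
import math
import random

def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def mat_sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def norm(a):
    """Frobenius norm, which is submultiplicative: |ab| <= |a||b|."""
    return math.sqrt(sum(v * v for row in a for v in row))

def op(x, y):
    """Group product in the parameters x = g - I: (I + x)(I + y) - I."""
    return mat_add(mat_add(x, y), mat_mul(x, y))

def defect_2prime(x, a, b):
    """The left-hand side |(xa - xb) - (a - b)| of condition (2')."""
    return norm(mat_sub(mat_sub(op(x, a), op(x, b)), mat_sub(a, b)))

def small_matrix(rng, size=0.1):
    """A random parameter matrix with entries in [-size, size]."""
    return [[rng.uniform(-size, size) for _ in range(2)] for _ in range(2)]

if __name__ == "__main__":
    rng = random.Random(0)
    x, a, b = small_matrix(rng), small_matrix(rng), small_matrix(rng)
    bound = norm(x) * norm(mat_sub(a, b))
    print("defect:", defect_2prime(x, a, b), "bound |x||a-b|:", bound)
```

Since |x| ≤ |x| + |a| + |b|, the printed bound is no larger than M(|x| + |a| + |b|)·|a − b| with M(λ) = λ, which is the form required by (2′).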


(By a Banach space is meant a B-space in the sense of Banach [1], that
is, a linear space in which an absolute value |x| is defined which (1) is
positive for x ≠ θ, (2) satisfies the triangle inequality |x+y| ≤ |x| + |y|, (3) is
multiplied identically by |λ| under any scalar expansion x → λx, and (4)
makes the space complete*: if lim_{m,n→∞} |x_m − x_n| = 0, then x exists
such that lim_{n→∞} |x − x_n| = 0.)

9. A topological group nucleus. Combining (2′)-(2″), we get immediately,

(2)     |(xay − xby) − (a − b)| ≤ M(|x| + |y| + |a| + |b|)·|a − b|.

Again, setting b = e = θ in (2′) and a = e = θ in (2″), one obtains,

(3′)    |x∘y − (x + y)| ≤ M(|x| + |y|)·|y|,
(3″)    |x∘y − (x + y)| ≤ M(|x| + |y|)·|x|,

which can be combined into the single inequality

(3)     |xay − (a + x + y)| ≤ M(|a| + |x| + |y|)·(|x| + |y|).

In words, near the origin group translations T_x^y differ little from the
corresponding linear translations L_x^y : a → a + x + y.

(9α) Multiplication is continuous near θ.

Proof. If |x|, |y|, |a| and |b| are small, then
|(x∘y) − (a∘b)| = |{(x∘y) − (a∘y)} + {(a∘y) − (a∘b)}|
= 


* Incidentally, {x_n} is metrically “fundamental” if and only if it is “fundamental” in the
topologico-algebraic sense (of van Dantzig) that lim_{m,n→∞} x_m^{-1}∘x_n = θ. Cf. (9β).


(9β) Every sufficiently small element x has a unique inverse x^{-1} satisfying
x x^{-1} = x^{-1} x = θ and |x^{-1}| ≤ 2·|x|.

Proof. Suppose M(5|x|) < 1/2. Define y_0 = θ and by induction
y_{n+1} = y_n − (x y_n). Then

|y_{n+2} − y_{n+1}| = |(x y_{n+1} − x y_n) − (y_{n+1} − y_n)|    by definition,
                    ≤ M(|x| + |y_n| + |y_{n+1}|)·|y_{n+1} − y_n|    by (2′),

since |y_{n+1} − y_n| = |−x y_n| = |x y_n|. It follows by induction that
|y_{n+1} − y_n| ≤ 2^{−n}·|x|, and M(|x| + |y_{n+1}| + |y_{n+2}|) < 1/2.
Hence lim_{m,n→∞} |y_m − y_n| = 0, and so by completeness a y exists satisfying
|y| ≤ 2·|x| and lim_{n→∞} |y − y_n| = 0. Now by (9α), xy = θ, and so y is a
right-inverse of x. Similarly x has a left-inverse z with zx = θ. Moreover y = (zx)y
= z(xy) = z; hence y = z = x^{-1} is a full inverse of x; its uniqueness follows
since xx′ = θ implies x′ = (x^{-1}x)x′ = x^{-1}(xx′) = x^{-1}, while x′x = θ implies
x′ = x′(xx^{-1}) = (x′x)x^{-1} = x^{-1}.
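
The successive approximation y_{n+1} = y_n − (x y_n) used in this proof can be run directly. The sketch below is an illustration under an assumption made here, not an example from the paper: the group is that of 2×2 matrices near the identity I with parameters x = g − I, so that x∘y = x + y + xy. In that case the recursion reduces to y_{n+1} = −x − (matrix product) x·y_n, and its limit y satisfies (I + x)(I + y) = I. The function names are hypothetical.

```python
import math

def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def norm(a):
    """Frobenius norm of a 2x2 matrix."""
    return math.sqrt(sum(v * v for row in a for v in row))

def op(x, y):
    """Group product in the parameters x = g - I: (I + x)(I + y) - I."""
    return mat_add(mat_add(x, y), mat_mul(x, y))

def inverse_by_iteration(x, steps=60):
    """Iterate y_{n+1} = y_n - (x o y_n), starting from y_0 = origin."""
    y = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(steps):
        xy = op(x, y)
        y = [[y[i][j] - xy[i][j] for j in range(2)] for i in range(2)]
    return y

if __name__ == "__main__":
    x = [[0.10, -0.05], [0.02, 0.08]]
    y = inverse_by_iteration(x)
    print("|x o y| =", norm(op(x, y)), " |y o x| =", norm(op(y, x)))
    print("|y| =", norm(y), " 2|x| =", 2 * norm(x))
```

In this instance the error is multiplied by x at every step, so the iteration converges geometrically whenever |x| < 1, and the limit is a two-sided inverse with |y| ≤ 2·|x| for small x, as the proof asserts in general.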
(9γ) Passage to the inverse is continuous near θ.

Proof. Let (x+u) be given. Substitute x^{-1} for y_0 and x+u for x in the
proof of (9β). By (9α), |(x+u)y_0| ≤ 2·|u| in a small enough neighborhood;
hence in the construction of (x+u)^{-1} by successive approximation,

|(x+u)^{-1} − x^{-1}| ≤ 2·|y_1 − y_0| ≤ 4·|u|.

We can summarize (9α)-(9γ) in

THEOREM 1. Every analytical group contains a topological group nucleus in
the usual sense.*

A topological space in which an associative multiplication is defined
satisfying (9α)-(9γ) everywhere is called a topological group (cf. [15]).

(9δ) |x^{-1}| ≤ |x| + o(|x|); in fact, |x^{-1} + x| = o(|x|).

Proof. By (3′), |x + x^{-1}| ≤ M(|x| + |x^{-1}|)·|x^{-1}|; but by (9β),
M(|x| + |x^{-1}|) = M(|x|). Hence x^{-1} = −x + u, where |u| = M(|x|)·|x| = o(|x|),
proving the result.

Digression on axiomatics: Setting y = θ resp. x = θ in (3′) resp. (3″), we
obtain (1). Further, near θ, (3′), (3″) make by = a imply that |y| is nearly
|b − a|. Hence if we are dealing with a topological group nucleus, then (2′)
and (2″) hold. (Proof: By symmetry, it suffices to prove (2′). But by (3′),
writing b^{-1}a = y, whence a = by,

|(a − b) − y| = |by − b − y| ≤ M(|b| + |y|)·|y|,
|(xa − xb) − y| = |xby − xb − y| ≤ M(|x| + |b| + |y|)·|y|.

And so by the triangle law, since by continuity M(|y|) = M(|a − b|)
≤ M(|a| + |b|) (cf. §7),

(2′)    |(xa − xb) − (a − b)| ≤ M(|x| + |a| + |b|)·|a − b|.)

* Cf. B. L. van der Waerden, Vorlesungen über kontinuierlichen Gruppen, Göttingen, 1932. For
the analogous notion of a Lie group nucleus (alias “germ,” cf. [11]).


10. Groups in the large. Let H be any full topological group. Obviously
any one-one bicontinuous map of a neighborhood of the identity of H onto
a region of a Banach space which satisfies (1), (2′), (2″) (or alternatively,
by the last paragraph, (3′), (3″)) defines that neighborhood as an analytical
group. We are unable to prove* that any system satisfying Definition 1 is
conversely a piece containing the identity of a full topological group.

Full topological groups which can be mapped locally onto Banach space 
in such a way as to satisfy Definition 1 will be termed full analytical groups; 
this will distinguish full groups from the analytical group nuclei with which 
we shall be concerned below and which, for brevity, we have called simply 
“analytical groups.” 

11. Changes of parameters. It is important to know which transforma- 
tions of Banach spaces play the role of analytical coordinate transformations 
in the theory of abstract Lie groups—that is, which when applied to a given 
analytical group G attached to a parameter space, turn G into another topo- 
logically isomorphic analytical group H. 

One can specify at once two classes of such transformations associated 
with an arbitrary Banach space B, namely: 

(11a) The group of “distortions” of B—that is, of those transformations 


* This has been proved for finite continuous groups by E. Cartan ([5], p. 18). Cartan omits to 
mention the decisive fact that if L is any Lie algebra, and N is its largest invariant “integrable” 
subalgebra, then L contains a semi-simple subalgebra S such that S∩N=0 and S+N=L (cf.
J. H. C. Whitehead, Proceedings of the Cambridge Philosophical Society, vol. 32 (1936), pp. 229- 
238). This omission led Mayer-Thomas to question ([11], p. 806) the validity of Cartan’s proof. 
Cartan has since published another equally technical proof (Sur la Topologie des Groupes de Lie, 
Paris, 1936, p. 22). 

Neither of these proofs can be extended to the infinite continuous group nuclei treated below; 
each depends on lemmas which need not be true in infinite dimensions. For instance the fact that 
not all closed linear subspaces S of Banach spaces B have complements T such that S∩T=0 and
S+T=B prevents one from using Cartan’s special proof for solvable groups. 

On the other hand Mayer-Thomas’ argument (due independently to Paul Smith) for the case of 
group nuclei which can be embedded in a full group generalizes to infinite continuous groups—one 
takes the subgroup of the full group generated by the nucleus given, and retopologizes this subgroup 
by redefining distance as geodesic distance along paths in the subgroup. 

Esthetically, one would expect to find a simple proof that every analytical group nucleus can be 
embedded in a full group, since it is easy to define the full group, if one knows that it exists. 


T of B into itself which leave θ fixed and satisfy (*) |T(a+x) − T(a) − x| ≤ M(|a| + |x|)·|x|.

(11b) The class of alterations of the norm function |x| of B to a new norm
function ‖x‖ such that the ratios |x|/‖x‖ and ‖x‖/|x| are uniformly
bounded.

Remark. The latter correspond one-one to choices of bounded open convex
regions ‖x‖ < 1 of B. (Cf. A. Kolmogoroff, op. cit. in §17.)

THEOREM 2. Any succession of transformations of types (11a), (11b) of the
parameter-space of an analytical group G, turns G into a topologically isomorphic
analytical group.


Proof. It is obviously sufficient to prove the theorem for single
transformations of types (11a) and (11b); again, the main difficulty is to prove
analyticity. With (11b), one need merely write |x|/R ≤ ‖x‖ ≤ R·|x|, and
replace M(λ) in (3) by R²M(λ/R) = M(λ). (Cf. §7.)

Consider case (11a). Setting a+x = b in (*), we see that (**) |T(b) − T(a)
− (b − a)| ≤ M(|a| + |b|)·|b − a|, i.e., vector differences, and hence absolute
values near θ, are nearly invariant under T. Hence (the proof, as in (9β),
is by successive approximation) T is one-one and so by (**) bicontinuous
near the origin. Therefore it suffices to prove (3′), (3″), or even, by
symmetry, (3′). This we shall do. Note that T^{-1} is of type (11a), and leaves vector
differences near θ almost invariant. Hence

|T^{-1}(a + x) − (T^{-1}(a) + T^{-1}(x))| ≤ M(|a| + |x|)·|x|,
|T^{-1}(a)∘T^{-1}(x) − (T^{-1}(a) + T^{-1}(x))| ≤ M(|a| + |x|)·|x|, by (3′).

Hence by the triangle inequality,

|T^{-1}(a)∘T^{-1}(x) − T^{-1}(a + x)| ≤ M(|a| + |x|)·|x|

and so, by (**),

|T(T^{-1}(a)∘T^{-1}(x)) − (a + x)| ≤ M(|a| + |x|)·|x|.

But this is (3′) in terms of the new parameters, q.e.d.
We shall regard topologically isomorphic groups as essentially identical— 
as differing merely in their parametric representation. 
12. Rectifiable paths. Let us recall a few familiar geometrical notions, so 
as to have a consistent notation and terminology for subsequent use. These 


notions are proper to abstract metric spaces, and so apply to Banach spaces. 
By a path is meant a continuous image P: p(λ) of a line interval‡ [0, Λ].

† The ideas go back to Fréchet's Thesis; cf., also, K. Menger, Zur Metrik der Kurven,
Mathematische Annalen, vol. 103 (1930), p. 471, §5.
‡ As is conventional, [0, Λ] denotes the set of real numbers λ which satisfy 0 ≤ λ ≤ Λ.


Two paths P and Q are called “geometrically equivalent” (written P ~Q) if 
and only if they can be identified by proper choice of parameters—i.e., if and 
only if one can establish a one-one sense-preserving correspondence between 
the intervals of which they are images, such that corresponding points have 
the same image. Clearly the relation of being geometrically equivalent is re- 
flexive, symmetric and transitive. 

Again, by a segment ΔP of P is meant the image of any subinterval Δ:
[λ_1, λ_2] of [0, Λ]. By a partition π of P is meant a division of [0, Λ] into
successive subintervals Δ_k: [λ_{k−1}, λ_k], where λ_0 = 0, λ_n = Λ, and k = 1, ..., n. By
the “product” of any two partitions π and π′ of P is meant the partition π·π′
whose subintervals are the intersections of the subintervals of π with those
of π′. And π is called a “subpartition” of π′ (in symbols, π ≤ π′) if and only if
π·π′ = π.

By the π-approximate length of P under any partition π is meant
|P|_π = Σ_{k=1}^{n} |p(λ_k) − p(λ_{k−1})|, and by the “length” of P is meant
|P| = sup_π |P|_π. A path P is called “rectifiable” if and only if |P| < +∞.
Obviously

(12α) If P ~ Q, then |P| = |Q|.

The “diameter” |π| of a partition π of a rectifiable path P is defined as
sup_k |Δ_k P|. It is not hard to show

(12β) |P| = lim_{|π|→0} |P|_π.

And since |π| ≤ |π′| provided π ≤ π′, we see

(12γ) |P| = lim_π |P|_π in the sense of Moore-Smith.†
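
The definitions of this section are easy to try out in the plane. The sketch below is an illustration added here: it takes the quarter circle p(λ) = (cos λ, sin λ) on [0, π/2], whose length is π/2, and computes the π-approximate length for uniform partitions. The function names are hypothetical.

```python
import math

def approximate_length(p, cuts):
    """Sum of |p(l_k) - p(l_{k-1})| over the partition points `cuts`."""
    points = [p(lam) for lam in cuts]
    return sum(math.hypot(b[0] - a[0], b[1] - a[1])
               for a, b in zip(points, points[1:]))

def uniform_partition(big_lambda, n):
    """Partition points of [0, big_lambda] into n equal subintervals."""
    return [big_lambda * k / n for k in range(n + 1)]

def quarter_circle(lam):
    """The path p(l) = (cos l, sin l); on [0, pi/2] its length is pi/2."""
    return (math.cos(lam), math.sin(lam))

if __name__ == "__main__":
    big_lambda = math.pi / 2
    for n in (1, 2, 4, 8, 1024):
        cuts = uniform_partition(big_lambda, n)
        print(n, approximate_length(quarter_circle, cuts))
    print("length:", math.pi / 2)
```

Each doubling of n gives a subpartition of the previous one, and by the triangle inequality the approximate length can only increase, approaching the supremum π/2.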

13. More notation. We shall now introduce some special but natural no- 
tation for handling rectifiable paths issuing from the origin (=identity) of an 
analytical group nucleus. 

If P_k is any path with domain [0, Λ_k], then t(P_k) denotes p_k(Λ_k) − p_k(0).

By the path-sum of r admissible paths P_1, ..., P_r will be meant the path
P_1 ⊕ ... ⊕ P_r formed by adding to P_1 ⊕ ... ⊕ P_{r−1} a segment congruent
to P_r under linear translation through t(P_1 ⊕ ... ⊕ P_{r−1}). And by the
path-product of the P_k will be meant the path P_1 ∘ ... ∘ P_r formed by adding
to P_1 ∘ ... ∘ P_{r−1} a segment congruent to P_r under group left-translation
through t(P_1 ∘ ... ∘ P_{r−1}). Thus the path-sum and the path-product both have
[0, Λ_1 + ... + Λ_r] for domain, and for 0 ≤ λ ≤ Λ_{k+1},

† This means that, given any neighborhood of |P|, one can find a π_0 such that π ≤ π_0 implies
that |P|_π lies in that neighborhood of |P|.


(13.1) 


Since linear and group translations leave distances invariant resp. nearly
invariant, the path-sum and the path-product are admissible.

We shall now develop an abstract correspondence between paths which
generalizes the correspondence between the sum-integral ∫x(λ)dλ and the
product integral ∫x(λ)dλ of functions whose values x(λ) are matrices. (N.B.,
t(Δ_k P) is the analogue of x(λ)Δλ.)

Accordingly, let P be any admissible path, and let π be any partition of P
into segments Δ_1P, ..., Δ_rP. Denote by P_k the image of Δ_kP after linear
translation through −t(P_1 ⊕ ... ⊕ P_{k−1}), and by Q_k the image of Δ_kP after
left-multiplication by the group-inverse of t(Q_1 ∘ ... ∘ Q_{k−1}). Then by
construction

P = P_1 ⊕ ... ⊕ P_r = Q_1 ∘ ... ∘ Q_r.

We shall define the two dualistic paths

P_π* = P_1 ∘ ... ∘ P_r    and    P_π† = Q_1 ⊕ ... ⊕ Q_r

formed by interchanging the operations of path-addition and path-multiplication.
Then we shall prove that the P_π* and P_π† approach fixed limiting
positions P* and P† as |π| tends to zero.
14. Evaluation of paths. Of course, the meaning of this statement depends
on how one defines limits, that is, on how one topologizes the “space” of images
of a fixed interval.

Let P and Q be any two images of the same interval [0, Λ]. We shall make
the definition

|P − Q| = sup_{0 ≤ λ ≤ Λ} |p(λ) − q(λ)|.

It is clear that this definition of distance makes the images of [0, Λ] the
“points” of an abstract metric space, in the sense defined earlier;† this
depends only on the fact that the images of [0, Λ] are themselves in a metric
space.

We now come to some statements involving group properties. In stating
and proving them we shall write Π_{k=1}^{r} x_k for (x_1 ∘ ... ∘ x_r), and
Σ_{k=1}^{r} x_k for (x_1 + ... + x_r).

<O(di-1| 


Proof. By the triangle inequality, 


k=1 
† Fréchet, op. cit., p. 36, introduces this very definition of distance, and shows that it is metric.


r—1 r—1 
x 
k=l k=l 
r—1 r—1 r—1 
k=l k=l kul 


by (3) and induction on r. Recombining—since, by induction on ,, 
get 


Lim!) so( Dial). 


k=1 k=1 k=1 k=l k=l 


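(14β) We have the inequality

$$\Bigl|\prod_{k=1}^{r} x_k - \prod_{k=1}^{r} y_k - \sum_{k=1}^{r}(x_k - y_k)\Bigr| \le M\Bigl(\sum_{k=1}^{r}|x_k| + \sum_{k=1}^{r}|y_k|\Bigr)\cdot\sum_{k=1}^{r}|x_k - y_k|.$$

Hence if $M\bigl(\sum_{k=1}^{r}|x_k| + \sum_{k=1}^{r}|y_k|\bigr) < 1$, then

$$\Bigl|\prod_{k=1}^{r} x_k - \prod_{k=1}^{r} y_k\Bigr| \le 2\sum_{k=1}^{r}|x_k - y_k|.$$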
Proof. By the triangle inequality iterated, 


- - s > (II =.) Il xs) 


k=1 k=l i=1 i=k+1 


(I=) »( II ys) = 


i=k+1 


k=1 k=1 k=1 


(14γ) Let $P_1, \cdots, P_r$ and $Q_1, \cdots, Q_r$ be admissible paths, each $P_k$ having
the same domain as $Q_k$. Further, let $|P|$ denote $\sum_{k=1}^{r}|P_k|$ and $|Q|$ denote
$\sum_{k=1}^{r}|Q_k|$. Then


r 


And if |P| +|Q| is so small that M(|P| +|Q|) <1, then 


| 0 P,) — @io---0@)| Ol. 


k=1 




Proof. The first inequality follows from (13.1) and the triangle inequality.
The second follows from (13.1) and (14β).

Thus with paths of sufficiently small total length, both path-sums and 
path-products are uniformly continuous functions of their arguments in our 
metric “path-space.” 


Lemma. Let $P$ be any sufficiently short path. Then if $\pi' \le \pi$, $|P_{\pi'}^{*} - P_{\pi}^{*}| \le M(|\pi|)$
and $|P_{\pi'}^{\dagger} - P_{\pi}^{\dagger}| \le M(|\pi|)$.

Proof. It is an essential preliminary remark that each segment of any
$P_\pi^{*}$ or $P_\pi^{\dagger}$ has nearly the same length as the corresponding segment of $P$, since
it is obtained from it by group and linear translation of subsegments through
relatively small distances, and such translations by (2) leave distances
nearly invariant.

Now write P;*=P,0 --- oP,. Clearly P,* is obtainable from P* by re- 
placing each component --- by the path P,=Pi10 
o P,,,. But referring to (14a),we see that | P, —P,| 
Hence by (147), 


| Pt — P# < Px| = Pl. 
k=1 


Similarly, write - - - @Q,. ClearlyP,-f is obtainable from P,f by re- 
placing each component On =Q.10 ---OQ;,, byapath Ox =Qir® 
By (14a) and the preliminary remark, |0,—Q;:| 
Hence by (147), 


| Pot — Oe] = )-| I. 


THEOREM 3. Let $P$ be any sufficiently short path. Then paths $P^{*}$ and $P^{\dagger}$
exist such that

$$|P_\pi^{*} - P^{*}| \le M(|\pi|) \quad\text{and}\quad |P_\pi^{\dagger} - P^{\dagger}| \le M(|\pi|).$$

Proof. By the above lemma, the $P_\pi^{*}$ and $P_\pi^{\dagger}$ converge in the sense of
Cauchy-Fréchet. But this means that for every fixed $\lambda$, the $p_\pi^{*}(\lambda)$ and $p_\pi^{\dagger}(\lambda)$
do, and hence (the space being complete) have limits $p^{*}(\lambda)$ and $p^{\dagger}(\lambda)$. These
limits define $P^{*}$ and $P^{\dagger}$; the inequalities of Theorem 3 then follow from the
corresponding inequalities in the lemma and passage to the limit.


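(14δ) $(P^{*})^{\dagger} = (P^{\dagger})^{*} = P$.

Proof. For every partition $\pi$, $(P_\pi^{*})_\pi^{\dagger} = (P_\pi^{\dagger})_\pi^{*} = P$ by definition. And to replace
each segment of $P_\pi^{*}$ or $P_\pi^{\dagger}$ by the corresponding segment of $P^{*}$ resp. $P^{\dagger}$
makes by (14γ) a proportionally small change in $(P^{*})_\pi^{\dagger}$ resp. $(P^{\dagger})_\pi^{*}$. Hence
$(P^{*})_\pi^{\dagger} \to P$ and $(P^{\dagger})_\pi^{*} \to P$ uniformly as $|\pi| \to 0$.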




(14ε) $|t(P^{*}) - t(P)| \le o(|P|)$ and $|t(P^{\dagger}) - t(P)| \le o(|P|)$.

Proof. For every $\pi$, $t(P_\pi^{*}) = t(P_1) \circ \cdots \circ t(P_n)$ where $t(P_1) + \cdots + t(P_n) = t(P)$.
Hence the first inequality merely restates (14α). The proof of the
second inequality is the same, since (cf. the preliminary remark in the proof
of the lemma above) $|P^{\dagger}| \le O(|P|)$.

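(14ζ) $(P_1 \oplus \cdots \oplus P_r)^{*} = P_1^{*} \circ \cdots \circ P_r^{*}$
and
$(P_1 \circ \cdots \circ P_r)^{\dagger} = P_1^{\dagger} \oplus \cdots \oplus P_r^{\dagger}$.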

Proof. (P:® - - - @P,)* is in particular the limit as of 

--- oP, .«,), where 


Pi Pesce = Px. 


Thus it is the limit as sup |7,|—0 of (P:),* 0 --- 0 (P,)s*. By (14y), this 
limit is P*# 0 --- oP. This proves the first identity; the proof of the second 
is similar. 

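Conversely (14ε)–(14ζ) define the correspondences $P \to P^{*}$ and $P \to P^{\dagger}$.

(14η) If $Q$ is any path, and $\Delta$: $[\lambda, \mu]$ is any interval of its domain, then
$q^{-1}(\lambda)\,q(\mu) = t((\Delta Q^{\dagger})^{*})$.

Remark. By $q^{-1}(\lambda)$ is of course meant $[q(\lambda)]^{-1}$.

Proof. Set $Q^{\dagger} = P$; $p^{*}(\mu) = p^{*}(\lambda) \circ t((\Delta P)^{*})$ by (14ζ); the result follows by
transposing $p^{*}(\lambda) = q(\lambda)$.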

(140) If P=Q, then P* ~Q* and Pt ~Qf. 


Proof. Obvious from the definitions. 


CHAPTER II. CANONICAL PARAMETERS 


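15. Scalar powers. In §§16-17, we shall consider straight rays $P_x$:
$p_x(\lambda) = \lambda x$ $(0 \le \lambda \le 1)$ and their star correspondents $P_x^{*}$.

Obviously $|P_x| = |x|$, and so by (14ε),

(15α) $|t(P_x^{*}) - x| \le o(|x|)$.

(15β) $|t(P_{x+y}^{*}) - t(P_x^{*}) - y| \le M(|x| + |y|) \cdot |y|$.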

Proof. Let x denote the partition of [0, 1] into ” equal parts. Then setting 
x,=x/n and y,=(x+y)/n in (148), we get (158) for (P.,,)*, (P.)* and 
(P,)*. Passing to the limit as n>, we get (15). 

Combining (15β) with (15α), we see that the so-called “canonical transformation”
$T$: $x \to t(P_x^{*})$ satisfies† $|T(x+y) - T(x) - y| \le M(|x| + |y|) \cdot |y|$, and so
is of type (11a). Hence (cf. Theorem 2) we have


† By $T$: $x \to t(P_x^{*})$ we mean that the position $x$ is imagined to be occupied by the element $t(P_x^{*})$.




THEOREM 4. The canonical transformation carries any analytical group G 
into a topologically isomorphic analytical group. 

Again, by definition, Pow: = Py (5s). While unless hu <0, =P), OF... 
whence 0 t(P,*). But 

(15γ) Let $R_x = P_x \oplus P_{-x}$. Then $t(R_x^{*}) = 0$.

Proof. Let denote the partition of R, into 2” equal segments, and set 
A=1/n. Then 

| ((R.)*)| =| 0 (Axo — Ax) o(— Ax)" | 
< | (Ax)""1 0 (— Ax)™"| + 2(| Axo — Az] ) 
by (2), substituting (Ax)"~! for x, (—Ax)"—" for y, (Ax o —Ax) for a and @ for }, 
and requiring x to be so small that M(|x| +|a|+]y|) <1, whence 
| xay| S| + M(| +] +] S| xy] +2] a]. 
But by induction on m and (14a), this yields 
| ((Rz)*)| = (m — 1)-0(| dx| ) + | Ax|) = m-0(| ) 
= n-M(|dx|)-|a|-| «| = x]. 
To complete the proof, let n>, so that M(|Ax|)—0. 

But if Au <0, ®P uz =P OP yz; hence in all cases UP 
=1t(P,*) 0 t(P,*), and so we have 

THEOREM 5. For fixed $x$, the $t(P_{\lambda x}^{*})$ are (locally) topologically isomorphic
with the additive group of the $\lambda$.

But by Theorem 4 the canonical transformation is one-one; hence the
function $x^{\lambda}$ defined by making $t(P_y^{*}) = x$ and $x^{\lambda} = t(P_{\lambda y}^{*})$ is defined and single-valued
near the origin.

(156) at=x (by definition), 2 o and (since 
(x*)4 =a, 

(15ε) $x^{\lambda}$ is a topologico-algebraic function of $x$, in the sense that any topological
isomorphism carrying $x$ into $y$ carries $x^{\lambda}$ into $y^{\lambda}$.

Proof. The assertion is true for positive integral \=m since (xo --- ox) 
=xl++:-+1=y", It is also true for positive rational \ since y= if and only 
if y=(y")/"=2"/"; while since x o y= if (by 15y)) and only if (by (98)) 
y =~", it is true for all rational \ = m/n. Finally, since the rationals are dense 


in the real continuum and x* depends continuously on X, it is true for all X. 
16. Canonical parameters. We are now in a position to introduce canoni- 


cal parameters. 
A group will be said to be under “canonical parameters” if and only if 




the canonical transformation $T$: $x \to t(P_x^{*})$ is the identical transformation $I$:


(16a) Any analytical group is transformed into canonical parameters by the 
canonical transformation—that is, the canonical transformation is idempotent. 


Proof. After $T$ has been performed once, if $\pi$ denotes the partition of $(0, 1)$
into $n$ equal parts, then by definition and Theorem 5, $t((P_x)_\pi^{*}) = (x/n)^{n} = x$,
whence, passing to the limit, iteration of $T$ leaves all points fixed.


(16β) Under canonical parameters, $x^{\lambda} = \lambda x$; hence scalar multiplication under
canonical parameters is an intrinsic topologico-algebraic operation.


Proof. By definition of $x^{\lambda}$ resp. canonical parameters, $x^{\lambda} = t(P_{\lambda x}^{*}) = \lambda x$.

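(16γ) In any analytical group, $x + y = \lim_{\lambda \to 0} (\lambda x \circ \lambda y)/\lambda$.

Proof. Referring to the inequality (3), we get for fixed $x$ and $y$, since
$\lambda(x + y) = \lambda x + \lambda y$,

$$|(\lambda x \circ \lambda y) - \lambda(x + y)| \le M(|\lambda|) \cdot |\lambda|.$$

Hence, dividing through by the scalar $\lambda$,

$$|(\lambda x \circ \lambda y)/\lambda - (x + y)| \le M(|\lambda|),$$

which completes the proof.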
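
The limit in (16γ) can be observed numerically in a linear group. The sketch below assumes the coordinates $(I + X) \to X$ of Chapter III, in which the group product reads $X \circ Y = X + Y + XY$; the matrices and helper names are illustrative choices, not taken from the text.

```python
# Sketch: x + y as the limit of (lam*x o lam*y) / lam in a linear group.
# Assumption: coordinates (I + X) -> X, so that X o Y = X + Y + XY.

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(len(a))] for i in range(len(a))]

def mat_scale(c, a):
    return [[c * v for v in row] for row in a]

def compose(x, y):
    """Group product in the chosen coordinates: X o Y = X + Y + XY."""
    return mat_add(mat_add(x, y), mat_mul(x, y))

def scaled_product(x, y, lam):
    """The quotient (lam*x o lam*y) / lam."""
    return mat_scale(1.0 / lam, compose(mat_scale(lam, x), mat_scale(lam, y)))

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j])
               for i in range(len(a)) for j in range(len(a)))

X = [[0.0, 1.0], [0.0, 0.0]]
Y = [[0.0, 0.0], [1.0, 0.0]]

# Deviation from the vector sum X + Y for decreasing lam.
errors = [max_abs_diff(scaled_product(X, Y, lam), mat_add(X, Y))
          for lam in (1e-1, 1e-3, 1e-6)]
```

In these coordinates the quotient equals $X + Y + \lambda XY$ exactly, so the deviation shrinks linearly with $\lambda$.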

Combining (16γ) with (16β), we get


THEOREM 6. If G and H are any two analytical groups under canonical 
parameters, then any topological isomorphism between G and H is linear—it 
preserves vector sums and scalar products. 

COROLLARY 6.1. The group of topological automorphisms of any analytical
group is spatially isomorphic with a group of linear transformations of its
parameter-space.


COROLLARY 6.2. If G and H are any two analytical groups, then any continuous
isomorphism between G and H can be expressed as the product of three
transformations of the parameter-space of G, of types (11a), (11b), and (11a).


COROLLARY 6.3. Admissible paths (cf. §13) are carried into admissible
paths under topological isomorphisms between analytical groups.


(16δ) An analytical group is under canonical parameters if and only if
$x \circ x = x + x$ for all $x$.


† One should prove further: Any topological isomorphism between two groups whose function of
composition is analytical, amounts to an analytical transformation of coordinates. To complete the proof,
it would suffice to show that in such groups $t(P_x^{*})$ is an analytical function of $x$, a fact already known
(from the theory of differential equations) for Lie groups.




Proof. If $x \circ x = x + x$, then by induction $x^{2^n} = 2^n x$, $x = (2^{-n}x)^{2^n}$, whence
$t(P_x^{*}) = x$, and we have canonical parameters. Conversely, under canonical
parameters, $x \circ x = x^{2} = 2x = x + x$.

17. Digression: Topologico-algebraic postulates. It is a curious fact that, 
by inverting the remarks of the last few sections, one can obtain topologico- 
algebraic postulates defining Lie groups, involving only intrinsic operations 
(i.e., operations invariant under topological isomorphisms). To show this, 
one need use only superficial reasoning, arguing from the above properties 
of canonical parameters.†

One can do this even for infinite continuous groups. The general procedure
is: 1°: characterize Banach spaces topologico-algebraically (as those complete
topological linear spaces possessing a convex open “bounded” set‡); 2°: define
linear transformations and thence Fréchet total derivatives (cf. §18)
topologico-algebraically;§ 3°: postulate that the group is a Banach space relative
to addition under canonical parameters (“canonical addition”) and raising
to scalar powers; 4°: postulate that an associative operation of multiplication
satisfying $x \circ x = x + x$ and continuously differentiable on the Banach space,
be defined.

Because of the preceding results and Corollary 2 of Theorem 15, these
postulates are satisfied by all analytical groups under canonical parameters.
Conversely, by Theorem 8 any system satisfying these postulates is an analytical
group, which is by (16δ) under canonical parameters.

In the special case of Lie groups (the case that the parameter space has
a finite basis, or equivalently,|| is locally compact) one can simplify these
postulates to the requirements (i) elements $a_1, \cdots, a_n$ exist such that any
element near the identity can be represented uniquely as a product
$a_1^{\lambda_1} \circ \cdots \circ a_n^{\lambda_n}$ of small powers of the $a_k$, and (ii) the function of
composition is continuously differentiable in $(\lambda_1, \cdots, \lambda_n)$-space.

18. Digression: metric postulates. The present section will be devoted to 
sketching a proof of 


† These ideas were announced in Abstract 41-5-192 of the Bulletin of the American Mathematical
Society (1935).

‡ For the terminology cf. J. von Neumann, On complete topological spaces, these Transactions,
vol. 37 (1935), pp. 1-20. For the characterization cf. A. Kolmogoroff, Zur Normierbarkeit eines
allgemeinen topologischen linearen Raumes, Studia Mathematica, vol. 5 (1935), pp. 29-33.

§ Replace the usual epsilon-delta definitions by “for every given neighborhood there exists a
neighborhood so small .”

|| Cf. [1], p. 84, Theorem 8.

{This can be phrased topologico-algebraically. For instance, xO y=f(x, y) has continuous first 
derivatives if and only if af/dx(a, b)=limy.0((a+Ax)Ob)/A and af/dy(a, 
exist and are continuous functions of a and b. 




THEOREM 7. One can redefine the class of analytical groups under canonical 
parameters by weakening the postulates for Banach spaces. 


This result will not be used elsewhere. 

Sketch of proof. Substitute “group products” $x \circ y$ for vector sums $x + y$,
and “scalar powers” $x^{\lambda}$ for scalar products $\lambda x$, continue to use an (extrinsic)
norm function $|x|$, and make the following alterations in the usual postulates
(cf. [1]) for Banach space (after first confining their validity to a small
region about the identity): (1) replace the two conditions $x + y = y + x$
and $\lambda(x + y) = \lambda x + \lambda y$ by the single weaker condition
$|x^{\lambda} \circ y^{\lambda} \circ (x \circ y)^{-\lambda}| \le |\lambda| \cdot |x| \cdot |y|$, and (2) replace the condition
$|x + y| \le |x| + |y|$ by the weaker condition $|x \circ y| \le |x| + |y| + |x| \cdot |y|$.

The reader should have no difficulty in proving that the altered postulates
hold in any analytical group under canonical parameters and under a suitable
norm function (cf. Theorems 1 and 5 for the algebraic identities, and (27β),
where it is shown that essentially $|(x \circ y) - (x + y)| \le |x| \cdot |y|$, for the strong
metric inequalities).

But conversely, if one defines x+y o y)! and Ax in any 
system G satisfying the new postulates, then the space becomes a neighbor- 
hood of the origin of a Banach space B, and the map of G on B satisfies 
(1), (2’), (2’’) and x o x=x+a—completing the outline of the proof. 

19. Digression: differentiability postulates. We now come to the connec- 
tion between Definition 1 and differentiability conditions, namely 


THEOREM 8. Let G be any topological group nucleus, some neighborhood of 
whose identity e is mapped onto a region of a Banach space B, in such a way that 
xo y=f(x, y) has first total derivatives everywhere, which are continuous at e. 
Then G is an analytical group under the map. 


Proof. Theorem 8 is clearly meaningless until continuous total derivatives
have been defined; actually, it refers to the usual definitions due to Fréchet.†
Fréchet says that $f(x, y)$ has a total derivative $A$ with respect to $x$ at $x = a$,
$y = b$ if and only if there exists a linear transformation $A$ such that


(18') $|f(a + x, b) - f(a, b) - Ax| \le o(|x|)$,

where $Ax$ denotes the transform of $x$ by $A$. One similarly defines total derivatives
with respect to $y$. Further, Fréchet calls the two total derivatives
$A(x, y) = \partial f/\partial x(x, y)$ and $B(x, y) = \partial f/\partial y(x, y)$ continuous at $x = a$, $y = b$ if and
only if


† M. Fréchet, La notion de différentielles dans l'analyse générale, Annales de l'École Normale
Supérieure, (3), vol. 42 (1925), pp. 293-323. For a similar concept of an infinite continuous group, 
cf. A. D. Michal and V. Elconin, Abstract transformation groups, American Journal of Mathematics, 
vol. 59 (1937), pp. 129-144. 




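$$|A(a + u, b + v)x - A(a, b)x| \le M(|u| + |v|) \cdot |x|,$$
$$|B(a + u, b + v)y - B(a, b)y| \le M(|u| + |v|) \cdot |y|.$$


Clearly $A(e, e) = B(e, e) = I$, the identical linear transformation, since
irrespective of $u$, $u \circ e = e \circ u = u$.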

Once these definitions and this fact have been stated, the proof of Theo- 
rem 8 follows familiar lines. Assuming the existence everywhere and continu- 
ity at x=y=e of A(x, y) and B(x, y), one constructs the real functions 


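$$\varphi(\lambda) = |(\lambda x \circ a) - (\lambda x + a)|,$$
$$\psi(\mu) = |(x \circ a \circ \mu y) - (x + a + \mu y)|.$$


Clearly (18') implies that the upper right-derivatives of $\varphi(\lambda)$ and $\psi(\mu)$ are
bounded by $|A(\lambda x, a) - I| \cdot |x|$ and $|B(x \circ a, \mu y) - I| \cdot |y|$, respectively.
Hence by the theory of real functions,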


(3) K({o] +] 2] +] 1), 


where $K(|a| + |x| + |y|)$ is small as long as $|A(\lambda x, a) - I|$ and $|B(x \circ a, \mu y) - I|$
are small identically on $0 \le \lambda, \mu \le 1$, and so by the continuity of these is
an $M$-function, q.e.d.

Remark. Fréchet’s definition obviously specializes to the usual definition 
of continuous total differentiability when B is finite-dimensional—and is 
satisfied in this case provided continuous first partial derivatives with respect 
to all coordinates exist.† (This remark has immediate application to the
theory of Lie groups—it shows that if the function x o y=f(x, y) has continu- 
ous first partial derivatives, then one is dealing with an “analytical group.”) 

In summary, §§17-19 have contained three alternative definitions of analytical
groups, equivalent to Definition 1. One can view these from two angles.
They may be regarded from a conceptual angle as giving a better picture of
what an analytical group is. Or they may be regarded as giving content to
Definition 1 itself, that is, as furnishing examples of analytical groups from
other contexts.

CHAPTER III. LINEAR GROUPS 


20. Axiomatization. It is a simple fact that one can axiomatize algebras of
linear operators‡ on Banach spaces.

To see this, one must first recall that the operators on any linear space $B$
which are defined everywhere, and carry vector sums into vector sums and
scalar multiples into scalar multiples, constitute a hypercomplex algebra with


† C. J. de la Vallée-Poussin, Cours d'Analyse Infinitésimale, Louvain, 1914, p. 141.
‡ By a “linear operator,” we mean ([1], p. 23) any continuous additive, everywhere defined function.
This conflicts with the usage for Hilbert spaces, where such operators are called bounded.




a principal unit $I$. (We shall use the notation $O$ for the transformation carrying
every $x \in B$ into 0; $I$ for the identity $x \to x$; $S, T, U, \cdots$ for other operators.)

One must next observe that if $B$ is a Banach space, then relative
to vector sums $T + U$, products $\lambda T$ with scalars, and the “modulus”
$\|T\| = \sup_{x \ne 0} |Tx|/|x|$ (cf. [1], p. 54), the linear operators on $B$ constitute
another Banach space. The proof of this will be left to the reader.†

Finally, $\|TU\| \le \|T\| \cdot \|U\|$ and $\|I\| = 1$.

More generally, any algebra of linear operators on a Banach space $B$ which
contains $I$ and is topologically closed (under the “uniform” topology defined
by the metric $\|T - U\|$) has all of the properties just described.

But conversely, let $\mathfrak{S}$ be any system having these properties, i.e., any
“metric hypercomplex algebra.”‡ Then (applying a classical construction)
each element $T \in \mathfrak{S}$ induces a linear transformation $\theta_T$: $X \to XT$ on the
elements $X \in \mathfrak{S}$. Moreover since $\|IT\| = \|I\| \cdot \|T\|$ and $\|XT\| \le \|X\| \cdot \|T\|$, the
“modulus” of $\theta_T$ is precisely $\|T\|$. Thus $\mathfrak{S}$ can be realized as a closed algebra
of linear operators on itself, including the identity $\theta_I$: $X \to XI = X$.

21. Linear operators with inverses. Linear operators do not constitute a 
group under multiplication. But the linear operators S with inverses S-! 
satisfying SS-!=S-1§=J do. And one can easily prove 


THEOREM 9. Let $\mathfrak{S}$ be any metric hypercomplex algebra. Then the map
$(I + T) \to T$ of the elements $(I + T) \in \mathfrak{S}$ with $\|T\| < \tfrac{1}{2}$ onto the linear space defined
by $\mathfrak{S}$, exhibits these elements as an analytical group $\mathfrak{G}$ under multiplication.


Proof. Refer to Definition 1. The only properties in any doubt are (2’)- 
(2’’). But 
Xo +¥)- Xo +2Z)] - -2Z]]| 
= — XoZ]|| =||Xo(¥Y —Z)||, by algebra, 
-||¥ — by hypothesis, 


Il 


IIA 


proving (2’). One obtains (2’’) similarly. 
THEOREM 10. The canonical transformation of $\mathfrak{G}$ is given explicitly by the
convergent power series

$$T \to T + \frac{1}{2!}T^{2} + \frac{1}{3!}T^{3} + \cdots = t(P_T^{*}).$$


t For instance, if {7,,} is a fundamental sequence of linear operators, then for any x, { Tnx} isa 
fundamental sequence in B, whose limit we shall define as Tx. By continuity, T(x+y) =7Tx+Ty and 
| Tx—Ty| . 

t More properly, any metric associative hypercomplex algebra. Omitting the associative law, we 
get a more general definition (cf. §30), which however yields no realization theorem. 




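Proof. The questions of convergence are settled by the inequalities
$\|T^{n}\| \le \|T\|^{n}$ and $\|T + U\| \le \|T\| + \|U\|$, and the assumption $\|T\| < \tfrac{1}{2}$. But now,
dividing $P_T$ into $n$ equal parts, we get by the binomial expansion

$$\Bigl(I + \frac{1}{n}T\Bigr)^{n} - I = \binom{n}{1}\frac{1}{n}T + \binom{n}{2}\frac{1}{n^{2}}T^{2} + \binom{n}{3}\frac{1}{n^{3}}T^{3} + \cdots,$$

which converges (cf. supra) to $\exp(T) - I$.

The inverse of the canonical transformation is of course given by the
power series

$$T \to \log(I + T) = T - \frac{1}{2}T^{2} + \frac{1}{3}T^{3} - \cdots,$$

but we shall not use this fact.†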
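
The construction in the proof of Theorem 10 can be checked on a small example. The sketch below is illustrative only: it takes a 2×2 matrix $T$ with small norm, forms $(I + T/n)^n - I$ for a large $n$, and compares it with the exponential series; it also applies the logarithmic series to recover $T$. The helper names and the choice of $T$ are assumptions made for the illustration.

```python
# Sketch: the canonical transformation T -> exp(T) - I of a linear group,
# obtained as the limit of (I + T/n)^n - I, and its inverse log(I + S).
import math

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(len(a))] for i in range(len(a))]

def mat_scale(c, a):
    return [[c * v for v in row] for row in a]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def exp_minus_identity(t, terms=30):
    """Partial sum of T + T^2/2! + T^3/3! + ..."""
    n = len(t)
    total = [[0.0] * n for _ in range(n)]
    power = identity(n)
    for k in range(1, terms + 1):
        power = mat_scale(1.0 / k, mat_mul(power, t))  # now T^k / k!
        total = mat_add(total, power)
    return total

def log_one_plus(s, terms=200):
    """Partial sum of S - S^2/2 + S^3/3 - ..., used here with norm(S) < 1."""
    n = len(s)
    total = [[0.0] * n for _ in range(n)]
    power = identity(n)
    for k in range(1, terms + 1):
        power = mat_mul(power, s)  # now S^k
        total = mat_add(total, mat_scale((-1.0) ** (k + 1) / k, power))
    return total

def equal_parts_product(t, doublings=20):
    """(I + T/n)^n - I for n = 2**doublings, by repeated squaring."""
    n = len(t)
    m = mat_add(identity(n), mat_scale(1.0 / 2 ** doublings, t))
    for _ in range(doublings):
        m = mat_mul(m, m)
    return mat_add(m, mat_scale(-1.0, identity(n)))

T = [[0.0, 0.3], [-0.3, 0.0]]       # generator of a plane rotation
series = exp_minus_identity(T)      # exp(T) - I by the power series
product = equal_parts_product(T)    # (I + T/n)^n - I with n = 2**20
recovered = log_one_plus(series)    # should reproduce T
```

For this $T$ the exponential is the rotation through 0.3 radians, which gives an independent check on the series.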

22. Generalization of Theorem 9. A metric algebra $\mathfrak{X}$ need not possess
a unit $I$ nor satisfy $\|I\| = 1$ in order that the symbolic elements $I + X$ with
$X \in \mathfrak{X}$ and $\|X\| < \tfrac{1}{2}$ should form an analytical group nucleus when multiplied
according to the rule


(22.1) $(I + X)(I + Y) = I + (X + Y + XY)$.


The arguments of §21 do not involve these assumptions.

An important example of such an algebra is due to Delsarte [6], and is
also cited by Yosida (op. cit.). It is the algebra $\mathfrak{A}$ of all infinite matrices
$A = \|a_{ij}\|$ for which $\sum_{i,j} |a_{ij}|^{2} < +\infty$. If we set $\|A\|^{2} = \sum_{i,j} |a_{ij}|^{2}$, we have a
Banach space, in which products $C = A \circ B = \|c_{ij}\|$ can be defined by the
usual rule $c_{ij} = \sum_{k} a_{ik} b_{kj}$, the series being convergent by Schwarz' inequality,
and satisfying, besides, $\|C\| \le \|A\| \cdot \|B\|$.

The algebra $\mathfrak{A}$ corresponds of course to the algebra of Schmidt kernels in
the theory of integral equations, and is isometric with Hilbert space.
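
The product inequality for this norm can be tried on finite matrices, regarded as infinite matrices with only finitely many non-zero entries. The sketch below is an illustration under that assumption; the particular matrices and helper names are arbitrary choices.

```python
# Sketch: the square-summable norm and the inequality ||AB|| <= ||A|| * ||B||,
# checked on finite matrices (a truncation of the infinite-matrix setting).
import math

def mat_mul(a, b):
    inner = len(b)
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(len(b[0]))] for i in range(len(a))]

def square_sum_norm(a):
    """Square root of the sum of the squares of all entries."""
    return math.sqrt(sum(v * v for row in a for v in row))

A = [[1.0, 2.0, 0.5], [0.0, -1.0, 3.0], [2.0, 0.25, -0.5]]
B = [[0.5, -1.0, 0.0], [1.5, 0.0, 2.0], [-0.75, 1.0, 1.0]]
C = mat_mul(A, B)

norm_a = square_sum_norm(A)
norm_b = square_sum_norm(B)
norm_c = square_sum_norm(C)
```

Each entry of C is an inner product of a row of A with a column of B, so Schwarz' inequality applied entry by entry gives the bound on the norm of C.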

23. Function of composition. The formulas of the preceding section lead 
directly to explicit expressions for the function X o Y of composition. 

Under the original parameters, X o Y = F(X, Y) is analytic since (by the 
distributivity of multiplication) it is linear in both variables—and among the 
functions between linear spaces, next to constant functions, linear functions 

† Remark: The above treatment was suggested by that of J. von Neumann [12]. The main
changes are: explicit discussion of transformations as abstract elements, and use (following Banach)
of the “modulus” $\|X\|$ for norm.

The concept of a metric hypercomplex algebra (“complete normed vector ring”) was announced
by the author in Abstract 41-3-104 of the Bulletin of the American Mathematical Society (1935);
a similar definition is given by K. Yosida (On the group embedded in the metrically complete ring,
Japanese Journal of Mathematics, vol. 13 (1936), pp. 7-26). Yosida does not require $\|I\| = 1$; cf. §22.

Another example, discussed at length by M. H. Stone, consists of the linear operators $T_a$:
$f(x) \to f(x)a(x)$ on the space of bounded functions on an abstract class. This is a closed subalgebra of
the algebra of §20.




are the most purely analytic. Thus the second partial derivatives of F(X, Y) 
are constant and so the higher derivatives all vanish identically. 

Moreover by Theorem 10, if we denote by $X^{n}$ as usual $X \circ \cdots \circ X$, then
we have the following explicit expression for $X \circ Y = G(X, Y)$ under canonical
parameters,


$$G(X, Y) = \log\bigl(I + F(\exp X - I, \exp Y - I)\bigr)$$


1 
= log (1 PLE, 
m=l1n=1 


k 
=> > —_ F(x", 
k=1 m=ln=1 

whose first terms can be found easily, and are monomials. 

It has been shown by J. E. Campbell [3] and F. Hausdorff [8] that this 
series can also be developed in terms of X, Y, and iterations of the bilinear 
function [X, Y]=XY—YX. The resulting “SCH-series” will be proved in 
Chapter V to be valid also for non-linear analytical groups. 
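
The leading terms of this development can be checked numerically for matrices. The sketch below assumes the standard first terms $X + Y + \tfrac{1}{2}[X, Y]$ of the series, which are not written out in the text above, and compares them with $\log(\exp X \exp Y)$ for small $X$ and $Y$; the helper names and test matrices are illustrative.

```python
# Sketch: log(exp X exp Y) agrees with X + Y + [X, Y]/2 up to third-order terms.
# Assumption: the first terms of the series are X + Y + (1/2)[X, Y].

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(len(a))] for i in range(len(a))]

def mat_scale(c, a):
    return [[c * v for v in row] for row in a]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def exp_series(x, terms=30):
    """I + X + X^2/2! + ..."""
    total = identity(len(x))
    power = identity(len(x))
    for k in range(1, terms + 1):
        power = mat_scale(1.0 / k, mat_mul(power, x))  # now X^k / k!
        total = mat_add(total, power)
    return total

def log_series(m, terms=60):
    """log(M) = S - S^2/2 + S^3/3 - ... with S = M - I of small norm."""
    n = len(m)
    s = mat_add(m, mat_scale(-1.0, identity(n)))
    total = [[0.0] * n for _ in range(n)]
    power = identity(n)
    for k in range(1, terms + 1):
        power = mat_mul(power, s)  # now S^k
        total = mat_add(total, mat_scale((-1.0) ** (k + 1) / k, power))
    return total

def bracket(x, y):
    """[X, Y] = XY - YX."""
    return mat_add(mat_mul(x, y), mat_scale(-1.0, mat_mul(y, x)))

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j])
               for i in range(len(a)) for j in range(len(a)))

eps = 0.01
X = mat_scale(eps, [[0.0, 1.0], [0.0, 0.0]])
Y = mat_scale(eps, [[0.0, 0.0], [1.0, 0.0]])

G = log_series(mat_mul(exp_series(X), exp_series(Y)))
first_order = mat_add(X, Y)
second_order = mat_add(first_order, mat_scale(0.5, bracket(X, Y)))

err_first = max_abs_diff(G, first_order)    # of order eps**2
err_second = max_abs_diff(G, second_order)  # of order eps**3
```

Including the bracket term reduces the discrepancy from second order to third order in the size of $X$ and $Y$.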

24. Digression: polynomials and analyticity. The algebraic significance of 
the SCH-series will be discussed in Chapter V; what about its analytical sig- 
nificance? 

It exhibits $G(X, Y)$ as analytical in the strong sense that (1) it is the limit
of an absolutely convergent series of polynomials of increasing degrees,†
(2) its derivatives all exist and can be found through term-by-term differen- 
tiation of the series, (3) hence the Taylor’s series for G(X, Y) converges ab- 
solutely to G(X, Y)—all within a sphere of positive radius. 

Although it is not entirely clear when a function between Banach spaces 
is “analytical”—there may be various generalizations of the established no- 
tion for functions between euclidean spaces—it seems undeniable that at 
least any function with properties (1)—(3) should be called analytical. 

25. Adjoint of an analytical group. In the present section, we shall show 
that the notion of the adjoint of a Lie group can be extended without real 
modification to the case of analytical groups. We state this more precisely in 
the following theorem: 


† For polynomial functions between Banach spaces, cf. S. Mazur and W. Orlicz, Grundlegende
Eigenschaften der polynomischen Operationen, Studia Mathematica, vol. 5 (1935), pp. 50-68 and pp.
179-189. One can define polynomials through continuity + the identical vanishing of $(n+1)$st
differences, through the identical vanishing of $(n+1)$st derivatives, or as sums of multilinear functions in
a variable repeated $0, \cdots, n$ times; and these definitions are equivalent.

Unlike these authors, we are concerned with functions of two variables. N.B.: A polynomial
function on $r$ variables which is homogeneous of degree $k$ in each, is homogeneous of degree $kr$ (and
not of degree $k$) on the product-space of the variables.




THEOREM 11. Let $G$ be any analytical group under canonical parameters.
Then each element $g \in G$ determines a linear transformation $\theta_g$: $x \to g^{-1}xg$ on the
parameter-space of $G$, and the correspondence $g \to \theta_g$ is continuously homomorphic.

Proof. Since $G$ is a topological group, $\theta_g$ is a topological automorphism.
Hence by Corollary 6.1 it is a linear transformation on $G$. Moreover by the
well-known identity $(gh)^{-1}x(gh) = h^{-1}(g^{-1}xg)h$, the correspondence $g \to \theta_g$ is
homomorphic. It remains only to show that it is continuous under the uniform
topology. But

| — g-'xg| = O(| by (3), 
= O(| x] 
= O(| 2x) 
(since ||6,|| = O(| g|), by (148)) 
=O(|g—h|)-| by (278), 
whence $\|\theta_g - \theta_h\| = O(|g - h|)$, completing the proof.

The validity of the proof of course depends on proving (27β) without the

aid of Theorem 11. We shall do this in §27.


The correspondence $g \to \theta_g$ does not always carry open sets into open sets:
it need not be “gebietstreu” in the sense of Freudenthal.


CHAPTER IV. COMMUTATION 

26. Outline. The present chapter will be devoted to showing how every 
analytical group G possesses a bilinear “commutation function” [x, y]. In 
Chapter V, it will be shown that [x, y] determines G to within local iso- 
morphism. 

The commutation function $[x, y]$ belonging to a given group $G$ is most
easily defined as the bilinear asymptote at $x = y = 0$ to the purely algebraic
commutator function


$$(x, y) = K(x, y) = x^{-1}y^{-1}xy.$$


The fact that (x, y) has a bilinear asymptote is proved below (in §28) from 
the relations (deduced in §27) 
{| (uo x, y) — (x, 9) — »)| +] 

| (x, v0 — (x, — (x, »)| = M(| 
(278) (x, »)| = 
while the fact that [x, y] is a topologico-algebraic invariant associated with G 
is almost obvious (cf. §29). 


(27a) 




Moreover one can deduce the familiar identities


(30α) $[x, y] + [y, x] = 0$,
$[[x, y], z] + [[y, z], x] + [[z, x], y] = 0$


as corollaries of formal identities on group products. These results can be 
summarized in the statement that G possesses a metric Lie algebra L(G). 
Chapter IV concludes with various applications of L(G), to the case that G 
is under canonical parameters. 

In the proofs of Chapter IV, group algebra plays a novel and essential role. 

27. The approximate bilinearity of $K(x, y)$. The present section will be
devoted to showing that $(x, y) = K(x, y)$ is approximately bilinear at $x = y = 0$,
in the sense that (27α)–(27β) are true.

The proof of (27α) is almost immediate. One has the formal identity

$$(u \circ x, y) = x^{-1}u^{-1}y^{-1}uxy = (u, y)_x \circ (x, y)$$

under the convention that $g_x$ denotes $x^{-1}gx$. But by the fundamental inequality
(2) of §8,


(278) | @, (u, »)| #|)-|@, ») I, 
whence | (u, y)z| =O(| (uw, y)|). Hence 


(277) 


Il 


Z=|(uox, y) — (x, y) — y)| 
<| (uo x, y) — (x, — (u, y)2| +| — 
< (a, y)|)-1@ 21+ MC 1 @, 9) | 
(by (277) and (3) of §9, and (276)) og 
since | (w, y)z| =O(| (uw, y)|). But this is the first half of (27a); the second half 


follows from the symmetry between left- and right-multiplication. 
As a special instance of (27a), we have 


| (om, 9) — 9) | MC +] 

Hence, since |x™| =O(|mx|) =O(m|x|), by induction 
| — y)| MGm| +] y|)-m- | (x, 9). 

Combining with the symmetric formula in (x, y"), we get if 


(276) (u™, y") — mn(x, y)| = M(m| «| + y|)-mn| (x, y) |. 


IIA 


Consequently, within some small radius p of the origin, 




(27¢) | (x, y)| OC| (x™, | /mn). 

But clearly within this sphere, given «~0 and yO, one can so choose m 
and » that 3p <|x™|, | y"| <p—whence, | (mx, my)| being bounded within this 
sphere, we get | (x, y")| <O(| mx| -|ny|), and so by (276), 

(278) | (x, »)| 


It is a corollary, since $y \circ x \circ K(x, y) = x \circ y$, and likewise
$(y \circ x) + (x \circ y - y \circ x) = x \circ y$, that (by (3))


(27β') $|x \circ y - y \circ x| \le O(|x| \cdot |y|)$.


28. The asymptote [x, y]. Substituting from (278) in (27a), and recalling 
that x+w=w o x impliest | w| ~| we get 
(28a) {| («+ w, y) — (x, y) — (w, S$ M(| +] 
| (x, + w) — (x, y) — (x, w)| = M(| «| +] 2] al, 
from which there follows 
(28a) | (x + u, y+) — (x, y)| S$ u| +] 2]). 


Now start anew with (28a)—(28a’), and use the same algebraic analysis 
used in proving (27). By (28a), 


| (mx, y) — ((m — 1)x, — (x, »)| M™m| «| +] 
Hence by induction on m, we get 
| (mx, y) — m(x, y)| M(m| x| +|y|)-m| x] -| 
Combining with the symmetric formula in (x, my), we have 
(288) | (mx, ny) — mn(x, y)| M(m| x| + n| y|)-mn-|x|-| 
By double use of (288), we get for 0<h/m, k/n <1, 


h k hk hk 
(—+—y) (x, 9) | s MC +1 
mn 


m n mn 


whence, by rational approximation and passage to the limit, using (28a’) to 
establish continuity, we have 


1 


for u<1. Therefore if \A+u+A’+y’ <e, then 


t By |w|~|«| we mean that| |w| |<M((| x! +| +|w|)- | «| ; this relation is evidently 
reflexive, symmetric, and transitive. 




1 1 


and so, by the completeness of the parameter-space,


(28δ) $[x, y] = \lim_{\lambda, \mu \downarrow 0} \dfrac{1}{\lambda\mu} K(\lambda x, \mu y)$


exists. Furthermore, by (287), 
Finally, since (—x+<2, y) = (0, y) =0, by (28a) 


«, y) — [- »]| MC 4] 


whence we see that

(28ε) $[x, y] = \lim_{\lambda, \mu \to 0} \dfrac{1}{\lambda\mu} (\lambda x, \mu y)$

exists.

29. The bilinearity of $[x, y]$, etc. In this section, we shall prove the
bilinearity and topologico-algebraic invariance of $[x, y]$.

The invariance of $[x, y]$ under continuous isomorphisms between groups
under canonical parameters follows from the definition and Theorem 6. And
by (28α), “distortions” of type (11a) change $(\lambda x, \mu y)$ by $o(|\lambda| \cdot |\mu|)$, from
which invariance under general continuous isomorphisms follows by
Corollary 6.2.

As for the bilinearity of [x, y], by (28a) 


1 1 
= = + wy) — = (Ax, wy) — = (Au, ny) 


1 
= M(|dx| +] wy] )-—- | - | 
Nu 


whence, passing to the limit, $[x + u, y] - [x, y] - [u, y] = 0$. Hence $[x + u, y]$
$= [x, y] + [u, y]$; and $[x, y + v] = [x, y] + [x, v]$ by symmetry. Also, by (27β)
and (28ε), $[x, y] = O(|x| \cdot |y|)$, and so is bounded. Hence it is bilinear.

In summary of the above results,


THEOREM 12. $(x, y)$ has a bilinear asymptote $[x, y]$ which is a topologico-algebraic
invariant of $G$.


Remark. In a linear group, algebra based on the expansion
$(I + \lambda X)^{-1} = I - \lambda X + \lambda^{2}X^{2} - \cdots$ shows

$$\frac{1}{\lambda^{2}}(\lambda X, \lambda Y) = (XY - YX) + \text{terms of higher order},$$

whence, passing to the limit, $[X, Y] = XY - YX$.
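
This limit can be observed numerically. The sketch below assumes the coordinates $(I + X) \to X$ for 2×2 matrices, forms the group commutator of $I + \lambda X$ and $I + \lambda Y$, and divides its coordinate by $\lambda^{2}$; the matrices and helper names are illustrative choices.

```python
# Sketch: (1/lam^2) * (lam*X, lam*Y) approaches XY - YX in a linear group.
# Assumption: coordinates (I + X) -> X, with 2x2 matrices for simplicity.

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(len(a))] for i in range(len(a))]

def mat_scale(c, a):
    return [[c * v for v in row] for row in a]

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def scaled_commutator(x, y, lam):
    """Coordinate of g^-1 h^-1 g h, divided by lam^2, for g = I + lam*x, h = I + lam*y."""
    g = mat_add(IDENTITY, mat_scale(lam, x))
    h = mat_add(IDENTITY, mat_scale(lam, y))
    k = mat_mul(mat_mul(inverse_2x2(g), inverse_2x2(h)), mat_mul(g, h))
    return mat_scale(1.0 / lam ** 2, mat_add(k, mat_scale(-1.0, IDENTITY)))

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

X = [[0.0, 1.0], [0.0, 0.0]]
Y = [[0.0, 0.0], [1.0, 0.0]]
bracket = mat_add(mat_mul(X, Y), mat_scale(-1.0, mat_mul(Y, X)))  # XY - YX

err_coarse = max_abs_diff(scaled_commutator(X, Y, 1e-1), bracket)
err_fine = max_abs_diff(scaled_commutator(X, Y, 1e-3), bracket)
```

The remaining discrepancy is of the order of $\lambda$, in agreement with the "terms of higher order" in the expansion.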
30. Metric Lie algebras. One can now deduce relations (30α) from algebraic
identities on group products.


In the first place, since $(y, x) = (x, y)^{-1}$, and $u^{-1} + u$ is nearly zero, clearly
$[x, y] + [y, x] = 0$. That is, $[x, y]$ is skew-symmetric.

The proof that $[x, y]$ satisfies Jacobi's identity,

$$[[x, y], z] + [[y, z], x] + [[z, x], y] = 0,$$
is less simple. It depends very essentially on realizing that by (278) and (286e), 
= | ((x, y), 2) [[x, y], 2] | 
(308) = | ((x, y); 2) ([x, y], z) | + | ([x, y], z) [[«, y], 2] | 

and, besides, on remarking that since $v \circ u = u \circ v \circ (v, u)$, to permute two
commutators in a group product changes the value by an amount which
is by (27β) small to the fourth order.

But direct computation based on cancellation provest 

(x, y)((x, 2)(z, *)((z, »)(y, ~)(y, 2)((y, 2), 2) = 0. 
Therefore, permuting terms, and cancelling 
(x, y)(y, = (2, »)(y, 2) = (, x)(x, 2) = 
we get by the preceding remark, the inequality 
| ((x, z)((y, z), x)((x, z), y) | < 0 | +| y| +|:| | 2 | | | | z|. 

Hence by (308) and the fundamental relation (3), 
Replacing $x, y, z$ by $\lambda x, \lambda y, \lambda z$ where $\lambda$ is small, and using linearity, we get
Jacobi's Identity in the limit.

Summarizing, we may say (in the language of Chapter III),


THEOREM 13. Relative to sums $x + y$, scalar products $\lambda x$, and “brackets”
$[x, y]$, the parameter-space of any analytical group nucleus $G$ is a metric Lie
algebra $L(G)$.

Remark 1. In §§26-30 we have nowhere assumed that G was under ca- 
nonical parameters. 


t This formula was suggested to the author by identities in §2.3 of [7]. 




Remark 2. Since |[x, y]| ≤ O(|x|·|y|), after changing the scale (i.e., multiplying
the norm by a suitable constant) we can assume simply |[x, y]| ≤ |x|·|y|.

Remark 3. Brackets [x, y] are defined for all x, y in the Banach space 
B, unlike x o y which is defined only locally. 

We shall show (Corollary 15.1) that G is determined to within local isomorphism
by L(G), and that conversely any metric Lie algebra belongs to an
analytical group nucleus. This shows that the problem of enumerating the 
analytical group nuclei with a given parameter-space is equivalent to that of 
enumerating the different metric Lie algebras on the same linear space. 

31. Subgroups and normal subgroups. The results of §§31-32 will refer 
to analytical groups G under canonical parameters. A subset S of elements of 
G will be called an analytical subgroup nucleus if and only if, relative to the 
topology and group multiplication table of G, S is itself an analytical group 
nucleus. An analytical subgroup nucleus S will be called normal (or invariant) 
if and only if for every g∈G, $g^{-1}Sg$ contains some neighborhood of the identity
of S.

If S is an analytical subgroup nucleus, then each x∈S must lie on a one-parameter
subgroup $x^{\lambda}$ in S, and hence (by Theorem 6) a segment λx in G
must lie in S. Again, the length of this segment must exceed some fixed positive
constant; otherwise we could find $\{x_n\}$ such that $\lambda_n x_n \in S$ implies $\lambda_n x_n \to 0$,
and this is impossible in an analytical group nucleus.

Therefore S must contain with x and y, k(λx, λy)/λ² for some fixed k>0
and all λ on [0, 1]; hence it must contain with x and y, k[x, y] (since, being
complete, it is closed in G). Similarly, it must contain with x and y,
$x+y=\lim_{\lambda\to 0}(\lambda x \circ \lambda y)/\lambda$. And finally, if two such subgroup nuclei contain
elements on the same class of segments λx∈G, then they clearly generate (in
case G is a group) the same subgroup of G, and so may be identified.
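The two limits used in this paragraph can be followed by hand in the simplest non-commutative case. As an illustrative assumption (not a case treated in the text at this point), take a group whose composition under canonical parameters is $x \circ y = x + y + \tfrac12[x,y]$ with every bracket of length three equal to zero. Then:

```latex
\begin{align*}
\frac{\lambda x \circ \lambda y}{\lambda}
  &= x + y + \frac{\lambda}{2}[x,y] \;\longrightarrow\; x + y
  \qquad (\lambda \to 0), \\
(-\lambda x)\circ(-\lambda y) &= -\lambda x - \lambda y + \frac{\lambda^{2}}{2}[x,y], \\
(\lambda x, \lambda y)
  &= \bigl((-\lambda x)\circ(-\lambda y)\bigr)\circ\bigl((\lambda x)\circ(\lambda y)\bigr)
   = \lambda^{2}[x,y], \\
\frac{(\lambda x, \lambda y)}{\lambda^{2}} &= [x,y].
\end{align*}
```

In the third line the sum of the two factors is $\lambda^{2}[x,y]$ and their bracket vanishes, because $[x,y]$ is central under the stated assumption.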


These facts may be summarized in 


(31α) Let G be any analytical group nucleus under canonical parameters.
The analytical subgroup nuclei S of G are pieces of closed subalgebras of the
metric Lie algebra L(G), two subgroups being identical if and only if the subalgebras
are.


If S is “normal” (i.e., invariant under all inner automorphisms), then x∈S
and g∈G imply that for some k>0,

\[ k[g, x] = \lim_{\lambda\to 0}\frac{k(\lambda g, \lambda x)}{\lambda^{2}} \in S. \]

Furthermore,

\[ g + x = \lim_{n\to\infty}\Bigl(\frac{1}{n}g \circ \frac{1}{n}x\Bigr)^{n} \in gS = Sg. \]

Hence g+S = Sg, and


(31β) If S is normal, then the associated subalgebra of L(G) is invariant†
and the cosets of S are the hyperplanes parallel to the manifold of S.


We shall prove converses to (31α)–(31β) in (32β) and Corollary 15.5.

32. The adjoint group. Using the commutation function, one can easily 
deduce an explicit series for the adjoint group of §25. 

Define T: u → T(u) = $(y/n)^{-1}$ o u o (y/n). Then T is a linear transformation,
and


1 1 1 
| T(u) — + [—y])| < vo(u,—y) (u.—y) 
n n n 
1 
n n 
1 1 
(|u| +—| yl al 
n n 


(by (3), (278) and (28e)). 


Hence by n-fold iteration and the binomial expansion, 


1 
T"(u) — + [u, + Cua [[u, vy], +-- | 


1 
(|u| yl. 


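Whence, since $T^{n}(u) = y^{-1} \circ u \circ y$, passing to the limit, we have

\[ |w(u, y)| = \Bigl| y^{-1}\circ u\circ y - \bigl\{u + [u, y] + \tfrac{1}{2!}[[u, y], y] + \cdots\bigr\}\Bigr| = o(|u|)\cdot|y|. \]

But since the terms are all linear in u, clearly

\[ |w(u, y)| = n\,\Bigl|w\Bigl(\frac{1}{n}u, y\Bigr)\Bigr| = n\cdot o(1/n)\cdot|u|\cdot|y|. \]

That is, letting n ↑ ∞, |w(u, y)| = 0, and so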


† In the usual sense, that x∈S and g∈L(G) imply [g, x]∈S.


(32α) \[ y^{-1}\circ x\circ y = x + [x, y] + \frac{1}{2!}[[x, y], y] + \frac{1}{3!}[[[x, y], y], y] + \cdots \]
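Formula (32α) can be verified exactly in a small matrix model. The sketch below assumes (this realization is an illustration, not part of the text) that parameters are strictly upper triangular 4×4 rational matrices, that the element with parameter X is exp X, and that [X, Y] = XY − YX as in the Remark preceding §30; then the inner automorphism has parameter e^{−Y} X e^{Y}, and every product of four such matrices vanishes, so all series terminate. All function names are hypothetical.

```python
from fractions import Fraction

N = 4  # strictly upper triangular 4x4 matrices: any product of four vanishes

def mat(rows):
    return [[Fraction(v) for v in row] for row in rows]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(N)] for i in range(N)]

def scale(c, a):
    return [[c * a[i][j] for j in range(N)] for i in range(N)]

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def bracket(a, b):
    """[a, b] = ab - ba (assumed linear-group convention)."""
    return add(mul(a, b), scale(-1, mul(b, a)))

def exp_nilpotent(a):
    """exp(a) = I + a + a^2/2! + a^3/3!, exact because a^4 = 0."""
    identity = mat([[int(i == j) for j in range(N)] for i in range(N)])
    a2 = mul(a, a)
    a3 = mul(a2, a)
    return add(add(identity, a),
               add(scale(Fraction(1, 2), a2), scale(Fraction(1, 6), a3)))

def adjoint_series(x, y):
    """x + [x,y] + (1/2!)[[x,y],y] + (1/3!)[[[x,y],y],y]; later terms vanish here."""
    b1 = bracket(x, y)
    b2 = bracket(b1, y)
    b3 = bracket(b2, y)
    return add(add(x, b1),
               add(scale(Fraction(1, 2), b2), scale(Fraction(1, 6), b3)))

X = mat([[0, 1, 0, 2], [0, 0, 3, 0], [0, 0, 0, 1], [0, 0, 0, 0]])
Y = mat([[0, 2, 1, 0], [0, 0, 1, 5], [0, 0, 0, 4], [0, 0, 0, 0]])

# Parameter of y^-1 o x o y in this model: exp(-Y) X exp(Y).
conjugate = mul(mul(exp_nilpotent(scale(-1, Y)), X), exp_nilpotent(Y))
print(conjugate == adjoint_series(X, Y))
```

The chosen X and Y have a non-zero double bracket [[X, Y], Y], so the comparison exercises the quadratic term of the series and not only the first bracket.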


From (32a) we deduce as a corollary, 


(32β) If the subalgebra associated with a given subgroup S of an analytical
group G is invariant, then S is a normal subgroup. (Converse of (31β).)


CHAPTER V. FUNCTION OF COMPOSITION 


33. Introduction. The main purpose of this chapter is to show in §§34-36
how the function x o y=f(x, y) of composition of any analytical group under
canonical parameters, can be written as the sum of an infinite series of polynomials
determined by the commutation function [x, y], and to deduce in
§37 various corollaries from this fact.

F. Schur [16] first showed that this series was valid in all groups under
canonical parameters. Campbell [3] and Hausdorff [8] have since obtained
it by other methods,‡ and so we shall call it the “SCH-series.”

The present exposition is preferable on three grounds to those cited. It
applies to infinite-dimensional groups. It paraphrases identities on pure group
products [§§38-40], and does not require Taylor's series or manipulations
with matrix polynomials (which are unnatural in non-linear groups). And
most important, it generalizes easily to yield similar series expressing the
definite (product) integrals over fixed time intervals of variable linear combinations
of infinitesimal transformations, in a form which (like the SCH-series§)
is independent of the group which they generate.

In §§38-40, the paraphrases in terms of identities on group products, of
the SCH-series and other identities in the theory of continuous groups, are
developed. They are not a part of the technical argument, unlike the paraphrases
of the identities of Lie-Jacobi, which are actually used in proving the
latter. They have been included because they correlate the theories of discrete
groups and continuous groups in a way essential to the full understanding
of either.


† Expressions (x, y) or [x, y] will be called “simple” commutators and brackets, respectively;
the commutator (φ, ψ) of any two commutators φ and ψ of “lengths” w(φ) and w(ψ), where for uniformity
individual letters are regarded as commutators of length one, will be called a “complex”
commutator of “length” w(φ)+w(ψ). Similarly with complex brackets [φ, ψ] of “length” w(φ)+w(ψ).

‡ Schur starts with the obvious identity f(x, (λ+dλ)y) = f(x, λy) o dλ·y, determines d/dλ{f(x, λy)}
from this, and integrates the resulting differential equation. Campbell and Hausdorff develop the
series by setting $e^{x}e^{y} = e^{f(x,y)}$, and use the algebra of matrices to solve for $f(x, y) = \log[1+(e^{x}e^{y}-1)]$,
thus introducing an extraneous operation of addition.

§ The SCH-series is the case where x operates first for a unit of time, followed by y operating 
for a unit of time. 




34. Product-equivalence of paths. Let G be any analytical group under 
canonical parameters. Then 


(34a) The problem of determining x o y=f(x, y) is equivalent to that of de- 
termining, given two short paths P and Q, whether or not t(P*) =t(Q*). 


(We shall express the relation t(P*) = t(Q*) by writing P~Q, and saying
P is product-equivalent to Q. By (140), P~Q implies P~Q.)

Proof. Since $x \circ y = t((P_x \oplus P_y)^{*})$ under canonical parameters, we have
found the z=f(x, y) when† we have found the $P_z \sim P_x \oplus P_y$. While conversely,
t(Q*) is approximated arbitrarily closely and hence determined by the
$t(Q_\pi^{*}) = t(Q_1) \circ \cdots \circ t(Q_n)$ for the different partitions π of Q, and the
$t(Q_1) \circ \cdots \circ t(Q_n)$ are determined by Q and the function of composition.

From (34α) and the known existence of an SCH-series expressing f(x, y)
in terms of the commutation function, we certainly can infer that t(Q*) is
determined by Q and the commutation function in a way valid in all groups G
under canonical parameters. But it by no means gives us explicit series for Q*
(except when $Q = P_x \oplus P_y$), and it is such series that we shall finally obtain,
getting the SCH-series as a special case (cf. §36).

Our first step will be to determine, given Q, all the P~Q. To this end we 


prove 
(34β) Let P and Q be any admissible paths with domain [0, Λ]. Then P~Q
if and only if some U: u(λ) exists, such that u(0) = u(Λ) = 0 and

\[ |\delta p - [u^{-1}(\lambda)\circ\delta q\circ u(\lambda) + \delta u^{\dagger}]| \le o(|\Delta Q| + |\Delta U|). \]

Proof. Suppose P~Q, and write p*(λ) = q*(λ) o u(λ). Since p*(0) = 0 = q*(0)
and p*(Λ) = t(P*) = t(Q*) = q*(Λ), u(0) = u(Λ) = 0. Define R=P*, so that
$P = R^{\dagger}$. Clearly if Δ: [λ, μ] is any interval, then by (147)

\[ t((\Delta P)^{*}) = t((\Delta R^{\dagger})^{*}) = r^{-1}(\lambda)\circ r(\mu) \]
\[ = \{u^{-1}(\lambda)\circ[q^{*-1}(\lambda)\circ q^{*}(\mu)]\circ u(\lambda)\}\circ\{u^{-1}(\lambda)\circ u(\mu)\} \]
\[ = \{u^{-1}(\lambda)\circ t((\Delta Q)^{*})\circ u(\lambda)\}\circ t((\Delta U^{\dagger})^{*}). \]

But |ΔP| = O(|ΔR|) ≤ O(|ΔQ*| + |ΔU|) ≤ O(|ΔQ| + |ΔU|); besides
|t((ΔP)*) − t(ΔP)| ≤ o(|ΔP|), and similarly with ΔQ and ΔU (by (146))
even after the inner automorphism induced by u(λ). Moreover by (3)
|x o y − (x+y)| = o(|x|+|y|); consequently if we write δp = t(ΔP),
δq = t(ΔQ) and δu = t(ΔU), we get

\[ |\delta p - [u^{-1}(\lambda)\circ\delta q\circ u(\lambda) + \delta u^{\dagger}]| = o(|\Delta Q| + |\Delta U|). \]


† We recall the notation $P_x$ for the path $p_x(\lambda) = \lambda x$ defined on [0, 1], and $P_x \oplus P_y$ for the broken
line R: r(λ) = λx on [0, 1], and r(λ) = x+(λ−1)y on [1, 2].


Conversely, suppose that this inequality is satisfied for some u(λ) (of bounded
variation‡) with u(0) = u(Λ) = 0. Then, when we write r(λ) = q*(λ) o u(λ), obviously
$t((R^{\dagger})^{*}) = t(R) = t(Q^{*})$. Moreover by the argument above, $r^{\dagger}(\lambda)$ satisfies
the given inequality. Therefore by the triangle inequality,

\[ |\delta p - \delta r^{\dagger}| \le M(|\Delta Q| + |\Delta U|)\cdot(|\Delta Q| + |\Delta U|). \]

Hence if π is any partition of [0, λ], writing ‖π‖ for sup (|ΔQ| + |ΔU|), and
summing inequalities, we get

\[ |p(\lambda) - r^{\dagger}(\lambda)| \le M(\|\pi\|)\cdot(|Q| + |U|), \]

whence in the limit $p(\lambda) = r^{\dagger}(\lambda)$.
We can rewrite (34β) perhaps more suggestively in the notation of differentials,
as

(34γ) \[ dp = u^{-1}(\lambda)\circ dq\circ u(\lambda) + du^{\dagger}. \]


35. Devices for calculation. Consider the terms of this formula. By (32α),
$u^{-1} \circ dq \circ u$ can be calculated explicitly from U, Q and the commutation
function.

Again, although we have not shown how to calculate $U^{\dagger}$ from U explicitly§
by using the commutation function, we can now do so in case U
is unidimensional.

(A path U: u(λ) will be called “unidimensional” if and only if it is confined
to a straight line, i.e., if and only if for some $u_0$, $u(\lambda) = a(\lambda)u_0$. If U is
unidimensional, then by Theorem 5, $U_\pi^{*} = U_\pi^{\dagger} = U$ identically, whence in the
limit $U^{*} = U^{\dagger} = U$. By a “unidimensional alteration” of any path Q with domain
[0, Λ], will be meant any path $P = R^{\dagger}$ determined from an R:
$r(\lambda) = q^{*}(\lambda) \circ [a(\lambda)u_0]$ for which a(0) = a(Λ) = 0. In this case, clearly P~Q
and furthermore by (34γ),

(35α) \[ dp = u^{-1}(\lambda)\circ dq\circ u(\lambda) + du. \]

And so P is determined by Q, u(λ) and the commutation function.)

Since (32α) gives an infinite series in any case, the fact that only unidimensional
alterations can be computed explicitly suggests the following procedure:
decomposing a given Q into unidimensional constituents, altering
these one at a time, and justifying the computations by proving general properties
of paths represented by infinite series. This we shall do, first proving

‡ I.e., such that the curve U: u(λ) is rectifiable.

§ N.B.: $U^{\dagger}$ differs from U by M(|U|), and hence one can deform a given Q little by little into
any desired shape (e.g., a straight ray), whose final position will be determined by Q and the commutation
function. But its calculation involves integrating a (highly involved) differential equation.


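(35β) Let $u_1(\lambda), u_2(\lambda), u_3(\lambda), \cdots$ be any twice differentiable functions
with domain [0, Λ] and values in a Banach space. Suppose that the
$\sup|u_k| = \sup_{0\le\lambda\le\Lambda}|u_k(\lambda)|$, the $\sup|u_k'|$ and the $\sup|u_k''|$ all form convergent
series. If [λ, λ+dλ] is any subinterval of [0, Λ], u(λ) denotes $\sum_{k=1}^{\infty}u_k(\lambda)$,
and $\delta u_k$ denotes $u_k(\lambda+d\lambda)-u_k(\lambda)$, then

\[ \Bigl|\delta u - d\lambda\sum_{k=1}^{\infty}u_k'(\lambda)\Bigr| \le O(|d\lambda|^{2}) \le o(|d\lambda|). \]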


Remark. It is a corollary that u is differentiable and has $\sum_{k=1}^{\infty}u_k'(\lambda)$ for
derivative.
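Proof. Since by the comparison test, all the series involved converge absolutely
(and uniformly!), and the terms of absolutely convergent series can
be permuted, clearly $\delta u = \sum_{k=1}^{\infty}\delta u_k$. Moreover for every k,

\[ |\delta u_k - d\lambda\cdot u_k'(\lambda)| \le \tfrac{1}{2}[\sup|u_k''|]\cdot d\lambda^{2}. \]

Summing, we get by the triangle law

\[ \Bigl|\delta u - d\lambda\sum_{k=1}^{n}u_k'(\lambda)\Bigr| \le d\lambda\sum_{k=n+1}^{\infty}\sup|u_k'(\lambda)| + d\lambda^{2}\cdot\sum_{k=1}^{n}\sup|u_k''|. \]

When we pass to the limit, this becomes

\[ \Bigl|\delta u - d\lambda\sum_{k=1}^{\infty}u_k'(\lambda)\Bigr| \le d\lambda^{2}\cdot\sum_{k=1}^{\infty}\sup|u_k''| \le o(|d\lambda|). \]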
k=1 k=1 


It will be convenient to signify that the hypotheses of (35β) are satisfied
by writing

\[ U = U_1 + U_2 + U_3 + \cdots. \]


We shall now get a path $R^{\dagger} \sim P_x \oplus P_y$, from which we shall be able to calculate
f(x, y) by using an algorithm applicable to all analytical combinations
of unidimensional paths. (The analyticity of $P_x \oplus P_y$ is concealed.)


THEOREM 14. Let R: r(λ) = λx o λy be defined on [0, 1]. Then $P_x \oplus P_y \sim R^{\dagger}$.
And (assuming |[x, y]| ≤ |x|·|y| by §30) if |x|+|y| < 1/10, then

\[ r^{\dagger}(\lambda) = \lambda y + \lambda x + \frac{\lambda^{2}}{2!}[x, y] + \frac{\lambda^{3}}{3!}[[x, y], y] + \cdots = s(\lambda). \]
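For matrices the differential relation behind Theorem 14 can be followed by hand. The computation below is a heuristic sketch only, assuming (this realization is not in the text) that the element with parameter $x$ is $e^{X}$, that $x \circ y$ corresponds to $e^{X}e^{Y}$, and that $[X,Y] = XY - YX$:

```latex
\begin{align*}
r(\lambda) &= e^{\lambda X} e^{\lambda Y}, \\
r(\lambda)^{-1}\,\frac{dr}{d\lambda}
  &= e^{-\lambda Y} e^{-\lambda X}
     \bigl(X e^{\lambda X} e^{\lambda Y} + e^{\lambda X} Y e^{\lambda Y}\bigr)
   = e^{-\lambda Y} X e^{\lambda Y} + Y, \\
e^{-\lambda Y} X e^{\lambda Y}
  &= X + \lambda [X,Y] + \frac{\lambda^{2}}{2!}[[X,Y],Y] + \cdots, \\
\int_{0}^{\lambda}\bigl(Y + e^{-\mu Y} X e^{\mu Y}\bigr)\,d\mu
  &= \lambda Y + \lambda X + \frac{\lambda^{2}}{2!}[X,Y]
     + \frac{\lambda^{3}}{3!}[[X,Y],Y] + \cdots .
\end{align*}
```

The third line is the matrix form of the adjoint series with $\lambda Y$ in place of $y$, and the last line integrates it term by term, which reproduces the series stated for $s(\lambda)$ in this special case.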


Proof. It is obvious from identities established in §14 that

\[ t((P_x \oplus P_y)^{*}) = t(P_x)\circ t(P_y) = x\circ y = t(R) = t((R^{\dagger})^{*}). \]


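The proof is complete if we can show that $|\delta r^{\dagger} - \delta s| \le o(|d\lambda|)$. For if this
is so, then the upper right-derivative of $|r^{\dagger}(\lambda) - s(\lambda)|$ is zero everywhere, and
so $r^{\dagger}(\lambda) = s(\lambda)$. But by (35β), δs differs from dλ{y+x+λ[x, y]+···} by
o(|dλ|), and this is by (32α) $d\lambda\{y + y^{-\lambda}\circ x\circ y^{\lambda}\}$. Again, by (3)
$d\lambda\{y + y^{-\lambda}\circ x\circ y^{\lambda}\}$ differs from

\[ t((\delta R^{\dagger})^{*}) = y^{-\lambda}\circ x^{d\lambda}\circ y^{\lambda+d\lambda} \]

by M(|dλ|·|x|)·|dλ|·|y| ≤ o(|dλ|). And by (14ε) we have $|t((\delta R^{\dagger})^{*}) - \delta r^{\dagger}| \le o(|d\lambda|)$,
completing the chain of links of length o(|dλ|) between δs and $\delta r^{\dagger}$,
and hence the proof.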
36. Evaluation of regular paths. We can now find $f(x, y) = t(R) = t((R^{\dagger})^{*})$
by a process which enables one to find series expressing t(P*) for any short
path P which is “regular” in a sense defined below.

Accordingly, let G be any analytical group under canonical parameters, in
which a scale of length has been so chosen that |[x, y]| ≤ |x|·|y|. Let P be
any path in G which can be written

\[ P = P_1 + P_2 + P_3 + \cdots \quad \text{(in the sense of (35}\beta\text{))}, \qquad P_i:\ p_i(\lambda)\cdot b_i \quad (0 \le \lambda \le 1), \]

where (1) the $p_i(\lambda)$ are analytical scalar functions with $\sum_{i=1}^{\infty}\int|dp_i| < 1/10$,
(2) the $b_i$ are brackets in elements $x_1, \cdots, x_r$ arranged in order of increasing
length, and containing with any $b_i$ and $b_j$ also $[b_i, b_j]$. Such a path
will be called regular.

Remark 1. By inserting dummy terms $0\cdot b_i$, one can make any sum of
scalar multiples of brackets in $x_1, \cdots, x_r$ satisfy (2) simply because the number
of different brackets of any preassigned length w in $x_1, \cdots, x_r$ is finite.
Remark 2. If |x|+|y| is small enough, then the $r^{\dagger}(\lambda)$ of Theorem 14 is
regular.
Remark 3. Since |[x, y]| ≤ |x|·|y|, $|b_i| < 1$ identically if $|x_1| < 1, \cdots, |x_r| < 1$.

THEOREM 15. Let P be any regular path. Then t(P*) is $\sum_{i=1}^{\infty}\gamma_i b_i$, where each
$\gamma_i$ can be calculated from $p_1(\lambda), \cdots, p_i(\lambda)$ in a finite number of rational operations,
integrations, and differentiations. The calculations are independent of G.


Outline of proof. We shall construct paths $P^{1} = P$, $P^{2} \sim P^{1}$, $P^{3} \sim P^{2}$, ···
by successive unidimensional alterations. Each $P^{\nu+1}$ will
be “regular” in the same sense that P is, except that 1/10 may be replaced
by some other constant <1/5. Moreover the $p_i^{\nu+1}(\lambda)$ for i≤ν will be of the
form $\lambda\gamma_i$, where $\gamma_i$ is independent of ν, and the $p_i^{\nu+1}(\lambda)$ for i>ν will be increasingly
negligible, whence $t(P^{*}) = \sum_i \gamma_i b_i$.
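Definition of $P^{\nu+1}$ by induction. If one sets

\[ u_\nu(\lambda) = [\lambda p_\nu^{\nu}(1) - p_\nu^{\nu}(\lambda)]\cdot b_\nu = \beta_\nu(\lambda)\cdot b_\nu \]

and can obtain a $P^{\nu+1} = \sum_i p_i^{\nu+1}(\lambda)\cdot b_i$ from $P^{\nu}$ through unidimensional alteration
by $u_\nu(\lambda)$, then assuming the term-by-term differentiability of all series,
by (32α) and (35α), we obtain heuristically

(*) \[ dp^{\nu+1} = dp^{\nu} + d\beta_\nu(\lambda)\,b_\nu + \sum_{j,k=1}^{\infty}\frac{1}{k!}[\beta_\nu(\lambda)]^{k}\,dp_j^{\nu}(\lambda)\,b_{i(j,k)}, \]

where $b_{i(j,k)}$ denotes the bracket $[\cdots[[b_j, b_\nu], b_\nu], \cdots, b_\nu]$ with k entries $b_\nu$. But clearly i(j, k) = i has in
no case an infinity of solutions (j, k). Hence we can certainly define

\[ dp_i^{\nu+1} = dp_i^{\nu} + \delta_{i\nu}\,d\beta_\nu + \sum_{i(j,k)=i}\frac{1}{k!}[\beta_\nu(\lambda)]^{k}\,dp_j^{\nu}(\lambda) \]

with the assurance of obtaining analytical $p_i^{\nu+1}(\lambda)$, and using only rational
operations, integration, and differentiation.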

Actual proof. Let us do this. Then, since every $b_{i(j,k)}$ is longer
than $b_\nu$, certainly by construction $p_\nu^{\nu+1}(\lambda) = \lambda p_\nu^{\nu}(1)$ and for i<ν,
$p_i^{\nu+1}(\lambda) = p_i^{\nu}(\lambda) = \lambda\gamma_i$ by induction. Furthermore

(36α) The series (*) converge in the sense of (35β). Consequently (collecting
terms) $P^{\nu+1} = \sum_i P_i^{\nu+1}$ in the same sense. Moreover $\sum_i\int|dp_i^{\nu+1}| < 1/5$.


Remark. They even converge absolutely if we replace each bracket by
the product of the absolute values of its entries.
Proof. If σ(λ), β(λ) and p(λ) are any real analytical functions, then certainly
sup |o| |do| = f | 


k 
s| fia || fie |. 
k 
sup |o'| f | -sup 
k-1 k 
sup |o”| f | -sup +[ f | aa! -sup | 


(differentiation is indicated by superscribing primes). Hence by induction on 
v,—since and < + & if <1—the series (*) con- 




verges in the sense of (358). Moreover (since grouping terms never increases 
sums of absolute values) for the same reasons $\sum_{i>\nu}\int|dp_i^{\nu+1}|$ (which bounds
$\sum_{i>\nu}\sup|p_i^{\nu+1}|$) does not exceed the corresponding sum for $P^{\nu}$ by a proportion
of more than $\int|dp_\nu^{\nu}|/(1-\int|dp_\nu^{\nu}|)$. And by induction this is at most
$5\int|dp_\nu^{\nu}|/4$. Consequently
i=v+1 5 \4 
and 


3 
flail -— f 
v v-1 
fla) s + f 
i=1 


t=1 


lA 


But four-thirds of the first sum, plus the second sum, is non-increasing as 
v | «—whence the second sum is always bounded by and the first 
tends to zero. 

This proves (36α). Hence (regrouping the terms of (*) through (32α)), by
(32α) and (35α), $P^{\nu+1} \sim P^{\nu} \sim P$. And since by inequalities just proved,
tends to zero as v increases, #(P*)=t((P’)*) 

This completes the proof of Theorem 15. 

37. Corollaries of Theorem 15. Theorem 15 has several immediate corollaries
of primary theoretical importance. We shall list some of these now.


COROLLARY 15.1. One can write f(x, y) as the sum of an infinite series of
scalar multiples of brackets of x and y arranged in order of increasing weight;
each coefficient can be computed after a finite number of rational operations, and
is a rational number.


Proof. In Theorem 14, $r^{\dagger}(\lambda)$ is (cf. Remark 2 above) a regular path whose
$p_i(\lambda)$ are polynomials (of degree at most the length $w(b_i)$ of $b_i$) with rational
numbers as coefficients. These properties are preserved under the rational
operations, differentiations, and integrations performed above, since any polynomial
can be differentiated or integrated by rational operations on its coefficients.

(The reader will find it instructive to compute the terms of degrees two 
and three.) 
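The terms of degrees two and three can be checked exactly in a nilpotent matrix model. The sketch below is an illustration under assumptions that are not in the text: parameters are strictly upper triangular 4×4 rational matrices, x o y corresponds to exp(X)·exp(Y), and [X, Y] = XY − YX. Under these assumptions the series through degree three reads x + y + ½[x, y] + (1/12)[[x, y], y] + (1/12)[[y, x], x], and all terms of degree four or more vanish. Function names are hypothetical.

```python
from fractions import Fraction

N = 4  # strictly upper triangular 4x4 matrices: any product of four vanishes

def mat(rows):
    return [[Fraction(v) for v in row] for row in rows]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(N)] for i in range(N)]

def scale(c, a):
    return [[c * a[i][j] for j in range(N)] for i in range(N)]

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def bracket(a, b):
    """[a, b] = ab - ba (assumed linear-group convention)."""
    return add(mul(a, b), scale(-1, mul(b, a)))

IDENTITY = mat([[int(i == j) for j in range(N)] for i in range(N)])

def exp_nilpotent(a):
    """exp(a) = I + a + a^2/2 + a^3/6, exact because a^4 = 0."""
    a2 = mul(a, a)
    a3 = mul(a2, a)
    return add(add(IDENTITY, a),
               add(scale(Fraction(1, 2), a2), scale(Fraction(1, 6), a3)))

def log_unipotent(g):
    """log(I + n) = n - n^2/2 + n^3/3, exact because n^4 = 0."""
    n = add(g, scale(-1, IDENTITY))
    n2 = mul(n, n)
    n3 = mul(n2, n)
    return add(n, add(scale(Fraction(-1, 2), n2), scale(Fraction(1, 3), n3)))

def series_through_degree_three(x, y):
    """x + y + 1/2 [x,y] + 1/12 [[x,y],y] + 1/12 [[y,x],x]."""
    second = scale(Fraction(1, 2), bracket(x, y))
    third = add(scale(Fraction(1, 12), bracket(bracket(x, y), y)),
                scale(Fraction(1, 12), bracket(bracket(y, x), x)))
    return add(add(x, y), add(second, third))

X = mat([[0, 1, 0, 2], [0, 0, 3, 0], [0, 0, 0, 1], [0, 0, 0, 0]])
Y = mat([[0, 2, 1, 0], [0, 0, 1, 5], [0, 0, 0, 4], [0, 0, 0, 0]])

composed = log_unipotent(mul(exp_nilpotent(X), exp_nilpotent(Y)))
print(composed == series_through_degree_three(X, Y))
```

With the opposite sign convention for the bracket the degree-two coefficient changes sign, so the check is tied to the convention stated in the lead-in.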

Caution. Because of the linear interdependence (due to the identities of
Lie-Jacobi) between the brackets of length w, the series of Theorem 15 is not
unique; its computation depends on the arrangement of the brackets of each
length w.




COROLLARY 15.2. The function x o y=f(x, y) of composition of any analytical
group G under canonical parameters is analytical.


Proof. By §24, brackets are polynomial functions. 


COROLLARY 15.3. If the Lie algebra of G is “w-nilpotent” (that is, if all
brackets of length w vanish), then f(x, y) is a polynomial of degree at most w.
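The smallest non-commutative instance of this corollary can be written out directly. The sketch below assumes, for illustration, the three-dimensional Heisenberg algebra (only [e1, e2] = e3 is non-zero), in which every bracket of length three vanishes, and takes f(u, v) = u + v + ½[u, v] as the composition; the names `bracket` and `compose` are hypothetical.

```python
from fractions import Fraction

def bracket(u, v):
    """Heisenberg bracket on triples: [e1, e2] = e3, other basic brackets are zero."""
    return (0, 0, u[0] * v[1] - u[1] * v[0])

def compose(u, v):
    """f(u, v) = u + v + (1/2)[u, v]; every bracket of length three vanishes."""
    b = bracket(u, v)
    return tuple(ui + vi + Fraction(1, 2) * bi for ui, vi, bi in zip(u, v, b))

u, v, w = (1, 2, 3), (-4, 5, 6), (7, -8, 9)

# The quadratic polynomial f is associative, and -u is the inverse of u.
print(compose(compose(u, v), w) == compose(u, compose(v, w)))
print(compose(u, tuple(-c for c in u)) == (0, 0, 0))
```

Here f is a polynomial of degree two in the coordinates, which is within the bound of the corollary for w = 3.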

COROLLARY 15.4. Two analytical groups having topologically isomorphic
Lie algebras are locally topologically isomorphic (and so analytically isomorphic).

Proof. Within some neighborhood of the identity, and under canonical 
parameters, they have the same function of composition. 

COROLLARY 15.5. Let L be the Lie algebra of any analytical group G, and
let S be any closed subalgebra of L. Then the elements in S near the origin are an
analytical subgroup nucleus.

Proof. They are a subgroup (by Corollary 15.1), satisfy (1), (2), (2’), and 
are a complete linear subspace of L. 

From Corollary 15.5 and (31α), we get


COROLLARY 15.6. The analytical subgroup nuclei of any analytical group G
under canonical parameters, are the closed subalgebras of its metric Lie algebra.


COROLLARY 15.7. A locally compact analytical group is a Lie group in the
usual sense.


Proof. Any locally compact Banach space is finite-dimensional by [1], p.
84, and the function of composition is by Corollary 15.2 analytical under canonical
parameters.


COROLLARY 15.8. A commutative analytical group nucleus under canonical
parameters is a neighborhood of the origin in a Banach space. 


38. Digression: paths and group-products. Since to assert $x_1 \circ \cdots \circ x_m = y_1 \circ \cdots \circ y_n$
is to assert

\[ P_{x_1} \oplus \cdots \oplus P_{x_m} \sim P_{y_1} \oplus \cdots \oplus P_{y_n}, \]

and since every admissible path can be approximated arbitrarily closely by
broken lines, one would expect product-equivalences† P~Q between images
of an interval [0, Λ] to correspond to algebraic identities between group products.
We shall sketch in §38 some crude examples of such correspondences.

The identity xy = yx(x, y) shows that if Q is any broken line, one can replace
any two segments of Q by the opposite sides of the parallelogram which
they determine, without altering t(Q*), provided a small deviation (x, y) is
inserted.

† We recall the notation P~Q meaning t(P*) = t(Q*).
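The transposition identity holds in every group, which can be confirmed by brute force in a small one. The sketch below checks xy = yx(x, y), with (x, y) = x⁻¹y⁻¹xy, over all pairs in the symmetric group on four letters; the choice of group and the function names are assumptions made for illustration.

```python
from itertools import permutations

def mul(p, q):
    """Product of permutations as tuples: apply p first, then q."""
    return tuple(q[i] for i in p)

def inverse(p):
    r = [0] * len(p)
    for i, image in enumerate(p):
        r[image] = i
    return tuple(r)

def commutator(x, y):
    """(x, y) = x^-1 y^-1 x y."""
    return mul(mul(inverse(x), inverse(y)), mul(x, y))

group = list(permutations(range(4)))

# Transposing two adjacent factors costs exactly one inserted commutator.
print(all(mul(x, y) == mul(mul(y, x), commutator(x, y))
          for x in group for y in group))
```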

The graphical principle (essential in the classical proofs of Green’s and 
Stokes’ Theorems) that any path-deformation can be split up into elementary 
deformations across parallelograms, is analogous to the algebraic principle 
that any permutation of terms in a sequence is the product of transpositions. 

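The derivation in §34, given a path Q, of paths P~Q by choosing
v*(0) = v*(Λ) = 0 and setting

\[ dp = v^{*-1}(\lambda)\circ dq\circ v^{*}(\lambda) + dv \]

corresponds to taking a product $x_1 \circ \cdots \circ x_n$ and a second product
$u_1 \circ \cdots \circ u_n = e$, defining $v_k = u_1 \circ \cdots \circ u_k$ (with $v_0 = e$), and proving by induction

\[ \prod_{k=1}^{j}[v_{k-1}^{-1}\circ x_k\circ v_{k-1}\circ u_k] = x_1\circ\cdots\circ x_j\circ v_j, \]

and thus concluding that

\[ \prod_{k=1}^{n}[v_{k-1}^{-1}\circ x_k\circ v_{k-1}\circ u_k] = \prod_{k=1}^{n}x_k. \]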
39. Digression: the Rearrangeability Principle. In correlating the argument
of §§34-36 with formal identities on group products, let us begin by
recalling a recent result of P. Hall ([7], Theorem 3.1), namely

(39α) \[ (xy)^{n} = x^{n}y^{n}z_1^{\phi_1(n)}\cdots z_t^{\phi_t(n)} \pmod{H_w}, \]

where the $z_i$ are complex commutators in x and y of lengths <w arranged in
order of increasing length, the exponents $\phi_i$ are polynomials of degree $w(z_i)$,
and $H_w$ is the normal subgroup whose elements are the products of commutators
of lengths ≥w.
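A formula of this shape can be tested in a group where all commutators of length three are trivial. The sketch below assumes, for illustration, the group of 3×3 upper unitriangular integer matrices and checks the collected form (xy)ⁿ = xⁿ yⁿ (y, x)^{n(n−1)/2}, with (u, v) = u⁻¹v⁻¹uv; the exponent n(n−1)/2 is a polynomial of degree two, the length of the commutator. Function names are hypothetical.

```python
def mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse(m):
    """Inverse of an upper unitriangular 3x3 integer matrix."""
    a, c, b = m[0][1], m[0][2], m[1][2]
    return [[1, -a, a * b - c], [0, 1, -b], [0, 0, 1]]

def power(m, n):
    result = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    for _ in range(n):
        result = mul(result, m)
    return result

def commutator(u, v):
    """(u, v) = u^-1 v^-1 u v."""
    return mul(mul(inverse(u), inverse(v)), mul(u, v))

def collected(x, y, n):
    """x^n y^n (y, x)^(n(n-1)/2): the collected form of (xy)^n in this group."""
    return mul(mul(power(x, n), power(y, n)),
               power(commutator(y, x), n * (n - 1) // 2))

x = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]
y = [[1, 0, 0], [0, 1, 1], [0, 0, 1]]

print(all(power(mul(x, y), n) == collected(x, y, n) for n in range(1, 9)))
```

In this group the commutator (y, x) is central, which is why the collection stops after one step.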

That there exist (not necessarily polynomial) functions $\phi_i(n)$ such that
(39α) is satisfied, is very easy to show. For since uv = vu(u, v), one can transpose
any two adjacent terms in any product involving x, y, and their commutators,
by inserting commutators of lengths greater than the length of
either transposed term. Hence one can first shift all the occurrences of x in
such a product to the extreme left, then all the occurrences of y to positions
just to the right of these, and similarly with $z_1, \cdots, z_t$.

This method, combined with the rule that any permutation can be accom- 
plished by successive transpositions, obviously yields a general 

Rearrangeability Principle. If one is given any product y involving elements
$x_1, \cdots, x_r$ and their commutators, any integer w, and any ordering
ρ of the $x_i$ and their commutators of weights <w, then y is congruent modulo
commutators of weights ≥w to a product of powers of the $x_i$ and their commutators
arranged in the sequence ρ.


More than this, one can distribute their occurrences according to any pre- 
assigned distribution function. 

These principles are the key to the algebraic situation. Using them, one 
can show for instance that if m=n”, then 


(398) sMy™ = Il ve (mod Hy), 
k=1 


where each 2, is of the and | &,(m; i) | 
<1, which means that the 2; are all nearly equal. 

Proof. Write Then transpose the 
occurrences of y (inserting commutators, of course) until you obtain the 
identity 


where the “, are congruent (mod H.,,) to products of the commutators 2; of 
lengths <w. Proceed by induction, dividing the occurrences of each 2, 
(h=1,--- , 2) into m nearly equal lots, and you will get (398). 

Now suppose « and y are elements of a continuous group under canonical 
parameters. Write x«"= and y"=9; since the v; are nearly equal, if we know 
by (278) that the elements of H., are relatively small, we see that | f(#, 5) —n2, | 
is small, where for m large 2 is nearly determined by «x, y, and the commuta- 
tion function [x, y]. 

40. Digression (cont.): analyticity and other remarks. We do not have to 
go far beyond the same principles to see from an algebraic standpoint even 
why an SCH-series exists, in the way that it does. 

To see this, observe that for fixed =%, and very large since x 
and y are correspondingly small, (1) products are nearly sums, and (2) com- 
mutators are nearly equal to the corresponding brackets. Hence if b, denotes 
the bracket in # and 9 corresponding to the commutator in « and y denoted 
by za, and 1)/n”@, then the smallness of | f(z, implies 
the smallness of | f(2, — {#+5+ > |. This gives one the first (¢+2) 
terms of an SCH-series, approximately. 

Actually, the \, are polynomials whose dominating terms are independent 
of m, although the reasons for this are number-theoretical and not at all 
trivial, and the calculation of the dominating terms is not even impossibly 
laborious. 

Similar reasoning yields an algebraic paraphrase of Theorem 15. Take 
any path X: x(A)=)77_,p:(A)-x;. Divide X into m=n” equal parts, set 




and «=y™, and obtain through the Rearrange- 
ability Principle (as with (398)), an identityt 


(402) (11 = (mod 22), 


k=1 i=1 k=l 


where the are products of nearly equal powers of commuta- 
tors z, in the y;. But replacing each commutator z, of length w, in the 4; 
by (m-“*)-b,, where b, denotes the bracket in the x; corresponding to 
z,—the substitution is nearly one of equals for equals—and setting y,(m) 
=nm-“*,(m; 1), (40a) becomes 


t 
(40a) is nearly >> 
h=1 


the calculation of ,(m) being the same for all groups. 
41. Every metric Lie algebra belongs to a group. We can now prove by 
considerations of convergence, that 


THEOREM 16. Every metric Lie algebra L is the Lie algebra of an analytical 
group nucleus. 


Proof. Define group products x o y=f(x, y) in L through the SCH-series. 
There are three points to establish: the convergence of the series, the validity 
of the inequalities (2’)—(2’’), and the associative law f(f(x, y), z) =f(x, f(y, 2)). 

By Remark 2 of §30, we can assume |[x, y]| ≤ |x|·|y|. Then by the proof
of Theorem 15 (cf. the remark after (36α)), if we substitute for each bracket
in the SCH-series, the product of the absolute values of its entries, and if
these are <1/10, then the sum of the absolute values of the resulting series
is bounded by 2(|x|+|y|). The convergence of f(x, y) provided |x|+|y|
<1/10 is a weak corollary of this.

Again, expanding [f(x, a)−f(x, b)]−(a−b) in SCH-series, we have by
Theorem 14 after cancellation and pairing off of corresponding terms, scalar
multiples of differences such as

\[ [x, [a, x], a] - [x, [b, x], b] = [x, [a-b, x], a] + [x, [b, x], a-b], \]

whose magnitude is bounded by |x|·|a−b| times the number $n_i$ of entries
in the bracket, times what we would get if we replaced every bracket by the
product of the absolute values of all but one of its entries. But the sum

t [mys*] denotes conventionally the integral part of my". 




of the products of these last two factors still converges absolutely provided 
|a| +|b| <1/20, so that ;(1/20)"*- <10(1/10)"*. Hence 


| (xa — xb)| —| (a S K-| 


within this region. This implies (2’); (2’’) follows by symmetry. It remains 
to prove the associative law. 

Here we do the obvious thing: substitute the SCH-series for f(x, y) for u in the
SCH-series for f(u, z), and likewise the SCH-series for f(y, z) for v in the
SCH-series for f(x, v), and expand in both cases by the distributive law. We
will get two series of monomial brackets in x, y, z, with possible repetitions.
If they are absolutely convergent, then by the continuity implied in (2’)–(2’’)
they will converge to f(f(x, y), z) and f(x, f(y, z)) respectively. We shall next
prove that they are absolutely convergent.

If |x|+|y|+|z| <1/80, and we replace each bracket in the series for
f(f(x, y), z) by the product of the absolute values of its entries, then the sum
of the absolute values of what we get is by the distributive law (on scalars)
what we would get if we replaced brackets by products in the SCH-series for
f(u, z), replaced z by |z|, and u by the sum of the absolute values of the terms
in the SCH-series for f(x, y). And since both of these are <1/40, the series
for f(f(x, y), z) is absolutely convergent. The absolute convergence of the
series for f(x, f(y, z)) follows by symmetry.

Hence to prove that f(f(x, y), z)=f(x, f(y, z)) we need only show that
irrespective of n, the sum of the terms of length ≤n is the same for the
two series. The demonstration of this essentially algebraic fact completes
the proof.

Demonstration. Form the multiplicative group of all non-commutative
polynomials $I+U = I+\lambda_1X+\lambda_2Y+\lambda_3Z+\cdots$ in X, Y, Z, ignoring terms of
degree >n. This is a $(4^{n}-1)$-parameter Lie group, in which
$(I+U)^{-1} = I-U+U^{2}-\cdots+(-1)^{n}U^{n}$. Since the group is analytical, the functions
f(f(X, Y), Z) and f(X, f(Y, Z)) are identically equal near X=Y=Z=0, and
hence formally equal. Moreover as in all linear groups, [U, V]=VU−UV.
But by Theorem 3 of the author's Representability of Lie algebras and Lie
groups by matrices, Annals of Mathematics, vol. 38 (1937), pp. 526-532, any
identity between alternants VU−UV follows formally from the identities of
Lie-Jacobi. Hence the equality between the sums of the terms of degree ≤n
in the two series follows formally from the identities of Lie-Jacobi (which we
assumed at the beginning).
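The formula for the inverse in the truncated polynomial group can be verified in one line. The only fact used, beyond the definitions above, is that $U$ has no constant term, so that every term of $U^{n+1}$ has degree greater than $n$ and is therefore ignored:

```latex
\begin{align*}
(I+U)\bigl(I - U + U^{2} - \cdots + (-1)^{n}U^{n}\bigr)
  &= I + (-1)^{n}U^{n+1} \\
  &\equiv I \pmod{\text{terms of degree} > n}.
\end{align*}
```

The sum telescopes because each power $U^{k}$ with $1 \le k \le n$ appears once with each sign.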




BIBLIOGRAPHY 


1. S. Banach, Théorie des Opérations Linéaires, Warsaw, 1932. 

2. G. Birkhoff, Integration of functions with values in a Banach space, these Transactions, vol. 38 
(1935), pp. 357-378. 

3. J. E. Campbell, On a law of combination of operators, Proceedings of the London Mathematical 
Society, vol. 28 (1897), pp. 381-390. 

4. E. Cartan, Notice sur les Travaux Scientifiques de M. Elie Cartan, Chap. VIII, Paris, 1931. 

5. E. Cartan, Les groupes continus et analysis situs, Mémorial des Sciences Mathématiques,
Fasc. 42, Paris, 1930.

6. J. Delsarte, Les groupes de transformations linéaires dans l’espace de Hilbert, Mémorial des
Sciences Mathématiques, Fasc. 57, Paris, 1932.

7. P. Hall, A contribution to the theory of groups of prime-power orders, Proceedings of the London 
Mathematical Society, vol. 36 (1933), pp. 29-95. 

8. F. Hausdorff, Die symbolische Exponentialformel in der Gruppentheorie, Leipzig Berichte, vol. 
58 (1906), pp. 19-48. 

9. S. Lie, Transformationsgruppen, Leipzig, 1888. 

10. S. Lie, Unendlich continuirliche Gruppen, Abhandlungen, Sächsische Gesellschaft der Wissenschaften,
vol. 21 (1895), pp. 43-150.

11. W. Mayer and T. Y. Thomas, Foundations of the theory of continuous groups, Annals of 
Mathematics, vol. 36 (1935), pp. 770-782. 

12. J. von Neumann, Gruppen linearer Transformationen, Mathematische Zeitschrift, vol. 30
(1929), pp. 3-42.

13. S. Saks, Théorie de l'Intégrale, Warsaw, 1933.

14. L. Schlesinger, Neue Grundlagen für einen Infinitesimalkalkul der Matrizen, Mathematische
Zeitschrift, vol. 33 (1931), pp. 33-61.


15. O. Schreier, Abstrakte kontinuierliche Gruppen, Hamburger Abhandlungen, vol. 4 (1926), pp. 
15-32. 

16. F. Schur, Neue Begründung der Theorie der endlichen Transformationsgruppen, Mathematische
Annalen, vol. 35 (1889), pp. 161-197.

17. V. Volterra, Sulle equazioni differenziali lineari, Rendiconti dei Lincei, vol. 3 (1887), pp.
391-396.


HARVARD UNIVERSITY, 
CAMBRIDGE, Mass. 




ON AN APPROXIMATE FUNCTIONAL 
EQUATION OF PALEY* 


BY 
W. C. RANDELS†


The purpose of this paper is to present and extend a method which 
has been found useful in the construction of “gegenbeispiels.” The methods 
of proof have all been used before by various writers, but the conditions used 
here are more general than those considered by the other writers. 

If we have a sequence of positive numbers $\{a_n\}$ such that

\[ \sum_{n=1}^{\infty} a_n = \infty \]

and such that the function

\[ f(z) = \sum_{n=1}^{\infty} a_n z^n \]

is analytic inside the unit circle, we know that $f(\rho e^{i\theta})$ will tend to infinity as $\rho \to 1$, at least for $\theta = 0$. If we introduce a sequence of factors $\{e^{ib_n}\}$ ($b_n$ real), the function

\[ f(z) = \sum_{n=1}^{\infty} a_n e^{ib_n} z^n \]

will tend to have its singularities spread over the unit circle and the order of $\max_{0 \le \theta \le 2\pi} |f(\rho e^{i\theta})|$, as $\rho \to 1$, will be decreased. This suggests the problem of determining a sequence $\{b_n\}$ corresponding to the sequence $\{a_n\}$ such that the order of $f(\rho e^{i\theta})$ as $\rho \to 1$ is the same for all values of $\theta$.
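The effect described above can be seen numerically. The following is a minimal sketch, not part of the paper: it assumes the illustrative choices $a_n = n^{-1/2}$ and $b_n = n \log n$, truncates the series, and samples $\theta$ on a finite grid, so the values are only approximations to the true maxima. The function name `max_modulus` and all parameters are hypothetical.

```python
import cmath
import math

def max_modulus(rho, phase, n_terms=1500, n_theta=360):
    """Approximate max over theta of |sum_n n^(-1/2) e^{i phase(n)} rho^n e^{i n theta}|.

    The series is truncated at n_terms and theta is sampled on n_theta points,
    so this is only an illustration of the order of magnitude.
    """
    coeffs = [n ** -0.5 * cmath.exp(1j * phase(n)) * rho ** n
              for n in range(1, n_terms + 1)]
    best = 0.0
    for k in range(n_theta):
        step = cmath.exp(2j * math.pi * k / n_theta)  # e^{i theta}
        power = step                                   # e^{i n theta}, starting at n = 1
        total = 0j
        for c in coeffs:
            total += c * power
            power *= step
        best = max(best, abs(total))
    return best

# Without phase factors the maximum is attained at theta = 0.
plain = max_modulus(0.99, lambda n: 0.0)
# With the assumed phases b_n = n log n the maximum modulus is smaller.
twisted = max_modulus(0.99, lambda n: n * math.log(n))
print(plain, twisted)
```

By the triangle inequality the second value can never exceed the first; the point of the sketch is only to show how much the phases reduce it for one assumed choice.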
The first result of this nature was given by Hardy and Littlewood [2].‡

They considered the case of $a_n = n^{\beta - 1/2}$ and they proved that

fa(z) = 8-1/2 exp [ian log = + (a), § 

log a onl 

where 


* Presented to the Society, February 23, 1935; received by the editors February 11, 1937. 

† This work was done while the author was a Sterling Research Fellow at Yale University. The 
problem was suggested by Professor Hille. 

‡ The numbers in brackets refer to the references at the end of the paper. 

§ To make the printing easier we shall sometimes write exp [x] instead of $e^x$. 



a 
= ———, z=pe®, p= = alog{—}, 
(log et 


F(c) = 


v=1 


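\[ \phi(\sigma) = \begin{cases} A + o(1), \text{ as } \sigma \to 0, & \text{if } \beta < \tfrac12, \\ O\bigl(\log \tfrac{1}{\sigma}\bigr), & \text{if } \beta = \tfrac12, \\ O(\sigma^{-\beta+1/2}), & \text{if } \beta > \tfrac12. \end{cases} \]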
The function $F(\sigma)$ is similar to the Weierstrass non-differentiable function, and extensive work by Hardy [1] on that function makes it possible to show that $f_\beta(z)$ is continuous for $|z| = 1$ and $\beta < 0$, and


if $\beta > 0$, 


Moreover they showed that* 
(2) folz) = asp—1, 
As a consequence of (1) and a theorem of Hardy and Littlewood [4, Theo- 
rem 4] it follows that 
(O(k-*), —1<B<0, 
(3) w(fa, h) | fa(e®) — fe(e™)| = -), 


1 


Hille [5] has given a proof of (1), but by his method one cannot prove (2). 
Hille, and Ingham [6], considered the similar functions 


for(2) = <1, y > 4B +1), 


n=2 


and showed that these functions are continuous for |z| =1. 
It is natural to try to prove similar theorems for functions of the type 

\[ f(z) = \sum_{n=1}^{\infty} b(n) \exp[iA(n)]\, z^n. \]

It is convenient to assume that the functions b(x) and A(x) are defined for 


* By the notation $f(x) = \Omega\{g(x)\}$ we mean that $f(x) \ne o\{g(x)\}$. 




all values of x on the interval $(1, \infty)$ and we shall do so from now on. This 
problem was first handled by Paley [7] who showed that, if $n_\nu(\theta)$ is defined 
by the relation 

\[ A'\{n_\nu(\theta)\} = 2\pi\nu - \theta, \]


then 


f(z) > 


n=1 


n(8) 


1 | exp ] + 0n,(6) 


— 2nvn,(6)}] + R(z), 


where R(z) is continuous for |z| =1. Paley’s results however were not as gen- 
eral as ours for he used a condition of the form 


b(n) = O(x-/2), < 1/10. 


Wilton [8] working on the same series when |z| =1 proved the equation. 
Sn(e*) > b(n) et (™ ein? 


m,(8)} 
©) (a”’{n,(6) 


Men) [1/2 
+00) f b(x) ] az), 


A’(x) 


exp [i{A[n,(6)] + — 2xn,(6)} ] 


where $n_\nu(\theta) \le m < n_{\nu+1}(\theta)$. Wilton mentions that his equation is in general a 
special case of one of van der Corput. We notice that neither the method of 
Paley nor that of Wilton will prove (1) or (2). 

The purpose of this paper is to give an extension of Paley's methods so as 
to include (1) as a special case. In §1 we prove a functional equation similar 
to (4) except that R(z) is not necessarily continuous. The functional equation 
of §1 could possibly be gotten from Wilton's (5) by replacing b(x) by $\rho^x b(x)$ 
but it would involve at least as many complications as the proof given here. 

In §2, it is shown by a method of Hardy [1] that 


G(z) qril2eril4 
[4’’(m,) 


n=1 


where, if $n_\nu(0) < x < n_{\nu+1}(0)$, 


b b( 


We also show that 


if 
(D) M(x) = of 
x) [a’(x) ° 


The remainder function R(z) is studied in §3 and it is shown that if M(x) 
is bounded, R(z) is continuous for |z| =1 and otherwise 


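In §4 we use a method of Hardy and Littlewood [4] to show that if f(z) 
is analytic for $|z| < 1$ and continuous for $|z| = 1$, and 

\[ f'(z) = O\{g(|z|)\}, \]

where $g(x) \uparrow \infty$ as $x \uparrow 1$, then 

\[ \omega(f, h) = \max_{|\theta_1 - \theta_2| \le h} |f(e^{i\theta_1}) - f(e^{i\theta_2})| = O\Bigl\{ \int_{1-h}^{1} g(x)\,dx \Bigr\}, \]

provided the integral exists. 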

Wilton [8] has given some applications of his equation so that in dis- 
cussing applications we confine ourselves to cases which have not been cov- 
ered by Wilton. It is pointed out that (1), (2), and (3) follow from this theory 
and also certain generalizations of them are possible. Ingham’s results also 
follow as pointed out by Wilton, and furthermore 


1 \—7—-8/2+3/2 B 3 
w( fey, = 04 (log —) if 


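where $f_{\beta\gamma}(z)$ is defined by 

\[ f_{\beta\gamma}(z) = \sum_{n=2}^{\infty} n^{-1/2} (\log n)^{-\gamma} \exp\Bigl[ i \int_1^n (\log x)^{\beta}\,dx \Bigr] z^n. \]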


It is finally shown that, if b(x) satisfies conditions (A), (B), and (C) of §1, 
and 


1—p 




f = O{x'*%(x)}, < 1/30, 


f b2(x)dx = 
1 


then, if A(x) is defined by 


A(x) = dy, 


the functions A(x) and b(x) will satisfy condition (D) of §2 which implies 
that the singularities of the function 

\[ f(z) = \sum_{n=1}^{\infty} b(n) \exp[iA(n)]\, z^n \]

are distributed uniformly over the unit circle, and the order of $f(\rho e^{i\theta})$ as $\rho \to 1$ 
is the same for all values of $\theta$. This answers the question proposed in the opening 
paragraph. 
1. The functional equation. The method used in this section is essentially 
that of Paley and the notation is chosen to agree with his. If the functions 
b(x) and d(x) are given we define 

\[ A(x) = \int_1^x d(y)\,dy, \]
\[ d\{n_\nu(\theta)\} = 2\pi\nu - \theta, \qquad 0 \le \theta < 2\pi, \]
\[ F(\nu; \theta) = A\{n_\nu(\theta)\} + \theta n_\nu(\theta) - 2\pi\nu\, n_\nu(\theta). \]


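We impose the following conditions on b(x) and d(x):* 
(A) The function d(x) has two derivatives, continuous on the open interval 
$(1, \infty)$ and 

\[ d(x) \uparrow \infty \text{ as } x \to \infty, \]
\[ d'(x) = O(x^{-1}), \]
\[ 1/d'(x) = O(x^{1+\epsilon}), \qquad 0 < \epsilon < 1/30, \]
\[ d''(x) = O(x^{-2+\eta}), \qquad \eta < 1/10 - \epsilon. \]

We notice that the first condition on d(x) implies that $n_\nu(\theta) \uparrow \infty$ as $\nu \to \infty$. 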


* It would obviously be sufficient to assume that these conditions are satisfied only for x greater 
than a certain $x_0$; but to simplify the work we carry through the proof for $x_0 = 1$. 




(B) The function b(x) is monotone and has one derivative continuous on 
$(1, \infty)$ and 

\[ b'(x) = O\{ b(x)\, x^{-1+\delta} \}. \]

The functions d'(x) and b(x) satisfy the condition: 
(C) There exists a constant k>1, such that for each function 

\[ f(x') < k f(x), \qquad k f(x') > f(x), \qquad \text{if } x \le x' \le ex. \]
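Condition (C) can be probed numerically for a given function. The following is a minimal sketch under stated assumptions: the helper `condition_c_constant` is hypothetical, it only samples finitely many points, and so it can suggest but never prove that (C) holds. It reports the largest ratio between $f(x')$ and $f(x)$ (in either direction) seen for $x \le x' \le ex$.

```python
import math

def condition_c_constant(f, xs, samples=50):
    """Largest value of f(x')/f(x) or f(x)/f(x') over sampled x <= x' <= e*x.

    A function satisfying (C) keeps this quantity bounded as the range of xs
    grows; any k above the bound would serve as the constant in (C).
    """
    worst = 1.0
    for x in xs:
        fx = f(x)
        for j in range(samples + 1):
            xp = x * math.exp(j / samples)   # x' runs from x to e*x
            ratio = f(xp) / fx
            worst = max(worst, ratio, 1.0 / ratio)
    return worst

# A power function: the ratio never exceeds e^(1/2), however large x is.
k_power = condition_c_constant(lambda x: x ** -0.5,
                               [2 * 1.5 ** i for i in range(40)])
# The exponential: the ratio grows without bound as x increases.
k_exp = condition_c_constant(math.exp, [1, 2, 4, 8, 16, 32])
print(k_power, k_exp)
```

The contrast between the two values illustrates why functions behaving like $e^x$ are excluded.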


If we had restricted ourselves to functions of the logarithmic-exponential 
type, condition (C) would mean that the functions could not be like $e^x$ or $e^{-x}$. 
We notice some consequences of (C). If f(x) satisfies (C), then 

\[ \frac{1}{f(x')} < \frac{k}{f(x)}, \qquad \frac{k}{f(x')} > \frac{1}{f(x)}. \]


If two functions $f_1(x)$ and $f_2(x)$ satisfy (C) with constants $k_1$ and $k_2$ respectively, 
then for $x \le x' \le ex$, 

\[ f_1(x') f_2(x') < k_1 k_2 f_1(x) f_2(x), \qquad k_1 k_2 f_1(x') f_2(x') > f_1(x) f_2(x). \]
If f(x) satisfies (C), f(x) >0 then for x<x’ Sex, 


f < ek J 


This means that, if the functions f;() and f2(x) satisfy (C) and are positive, 
then the functions 


aa’ fi(x)fe(x), fily)dy, 
also satisfy (C) for x>a>1 for some a. If f(x) >0 satisfies (C), then for 
Sx’ <etly, 
f(*’) 
f(x) f(x) f(e"x) 
If f(x) >0 satisfies (C), then 


< < (2’/2) “(=)”. 
x 


and 


¥ pin) < f 


n=1 


f f(x)dx = f(x)dx < f(n), 


n=1 


so that the series and integrals are equiconvergent. 
We wish to obtain an approximate functional equation for the function, 

\[ f(z) = \sum_{n=1}^{\infty} b(n) \exp[iA(n)]\, z^n. \]

We first consider the group of terms* 

\[ \sum_{r-s}^{r+s} b(n) \exp[iA(n)]\, z^n = \sum_{r-s}^{r+s} c(n) \exp[iA(n) + in\theta], \]

where $r = n_\nu$, $s = n_\nu^{3/5}$, and where $c(x) = b(x)\rho^x$. We have 


> exp [iA(n) + ind] — exp [iA() + 


r—é r—s 


- of + — c(n) | 
= of max at 


|x—n,|Ss x 
= Of (1 — o(n,)p 
since by (C) and (B), 


c!(x) = b(x)p" log p + '(x)p = Of (1 — p) +m, }, 
gn, x S 


ny/2 a 


By (A) 
r+s r+s 
> exp [iA(m) + ind] -f exp [iA(x) + ix? — 2rivx]dx 


r—s 8 


d 
= n, max | — [iA(x) + — 2 rivx] 
dx 


|x—ny| Ss 


|x—ny| Ss Ny 


* By the symbol $\sum_a^b$ we shall understand the sum over all values of n such that $a \le n < b$. 




= of max = 


|x—ny|Ss 


We have, again using (A), 
3 
A(x) + «0 — = F(v; 0) + (x — n,)?2d’(n,) + 
so that 


r+s r+e 
f exp[iA(x) + — 2rivx|dx — expliFs0)] f exp — m,)?d’(n,) |dx 


By a simple change of variable 


r+s t 
f exp [i(x — n,)?d'(n,) |dx = 2[a'(m) f 
rs 0 
where ¢t=n,*/5[d’(n,) |-”?, and we know that 
0 

By considering the graph of the function $\cos x^2$ we see that $\int_t^\infty \cos x^2\,dx$ can be 
represented as an alternating series for which the absolute value of the terms 
is steadily decreasing. Therefore 

\[ \int_t^\infty \cos x^2\,dx = O\{ [d'(n_\nu)]^{-1/2}\, n_\nu^{-3/5} \}. \]

By applying similar considerations to $\sin x^2$ we see that 

\[ \int_t^\infty e^{ix^2}\,dx = O\{ [d'(n_\nu)]^{-1/2}\, n_\nu^{-3/5} \}. \]
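The size of the Fresnel tail used here can be checked numerically. The sketch below is an illustration only: it relies on the classical value $\int_0^\infty \cos x^2\,dx = \sqrt{\pi/8}$ and on the elementary bound $|\int_t^\infty \cos x^2\,dx| \le 1/t$ (obtained by one integration by parts), and it uses a plain Simpson rule. The helper names `simpson` and `fresnel_cos_tail` are hypothetical.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with n (made even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def fresnel_cos_tail(t, steps_per_unit=2000):
    """Tail integral of cos(x^2) from t to infinity.

    Computed as sqrt(pi/8) minus the integral from 0 to t.
    """
    limit = math.sqrt(math.pi / 8)
    n = max(2, int(steps_per_unit * t))
    return limit - simpson(lambda x: math.cos(x * x), 0.0, t, n)

for t in (2.0, 5.0, 10.0, 20.0):
    # The tail stays within the 1/t bound, matching the O(1/t) estimate.
    print(t, fresnel_cos_tail(t), 1.0 / t)
```

With $t = n_\nu^{3/5}[d'(n_\nu)]^{1/2}$ the bound $1/t$ is exactly the order $[d'(n_\nu)]^{-1/2} n_\nu^{-3/5}$ appearing in the text.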


Hence 


r+s 
(1.3) f exp [iA(x) + ix0 — 2wivx]dx — |-1/2 exp [iF (v; 6) 




This implies that 


r+s 


(1.4) exp [iA(m) + ind] = exp [iF 0) 


6/5 ny/2 1/5+5 ny/2 2/5+9 


+O{(1 — p)b(n,)n, p + b(n,)n, b(n,)n, 
We set a(x) =A(x) + 0x, y(x) =a(a+1) —a(x). We have then 


{eiv(m) — 1} 
r+8s r+s 1 


ety(n) om» 


where 7; =%,41, 5: and if we define 
z 
H(x) = { eiv(m) 1} = — gia) = O(1), 
1 


this gives us 


r+s8 r+s 1 


Integrating by parts we have 


c(x) dH(x) = c(x)H (x) 


— 1 — 1 


d c(x) 
-f H(x) — 
ote dx —1 


By the definition of y(x) and property (A), 

v(x) = d(x) +Of{d(x)} +0, = d(x) + Ofd(x)} 
and 

y(n,) — 2rv +0 = O{d’(n,)} 
Consequently by (C) there exists a constant M>0 so that for v sufficiently 
large 
3/5 r+s 
y(n, nm, ) — +06 = y(n,) — + 0 +f y'(x)dx 
> ne” min [d’(n, + x)] — Of{d’(n,) + 
Osz<s 


3/5. —l—e 


and by (A) for x >n,+n,*, there exists an M’ such that 




(x) — +0 > + min [d’(y)] 


r+ssysx 
> M(x 
If we define N, by the relation 
d(N,) = (20 + — 8, 

then since y’(x) =d(x+1) —d(x) >0, we have for x<VN,, 

| — 2v + + 0| 
We know however that, for 2rv<y<(2v+1)z, 

(e — = Of (y — 2mv)“}, 
and therefore 


l+e | 3/5 
(x—m) m+n, 


(1.5) 
By a similar reasoning 


(1.6) 1) = Of x) } N, Ss x = — 


and 


(1.7) — 1)? = 04( 


Therefore, by (C), 
c(x)H(x) 


2/5+6 
r+s 


We also have* 


d c(x) 
H(x) dx 
dx —1 
2Qny Ny m4, /2 d c(x) 
r+s Np | dx — 


= + I2 + I3+ 


Using (1.5) and the fact that y’(x) =O(1/x) we have 


3/5 


* In case 2n,>N, we drop Js, if Ny> 4341, we drop and if 42,41 we drop J;. 


It is essential that »,-+7,2/*<n,4,—n3.,. This will be shown later (cf. (2.1)). 




x— Nn, rts (4% — 
and by (B) and (C) 


0} J [(1 — p)b(m, + x)(m, + 
(1.9) : 


l+e ny 


= O{[(1 — p)b(m,)n, p+ log n,} 


and 
(1.10) 7/’ = c(n, + x)(n, + x) ax\ = Ofc(n,)n 


Similarly 
I, = OU? + If’), 


where 


— p)b(x)p? + x*'b(x)p?|dx 


(1.11) 


and 


(1.12) = of 


2n, 
Also I;=O(I§ +13"), and by (1.7), if 


uniformly for 7 =0, 1, - - - , so that 


Ny+1/2 
(1.13) If = of — p)b(n)p" + 


Nv 


Ny+4/2 
Ny 


Finally and by (1.6) 




dx 
+ B(x + m)(x + =} 
x 




If = f [(1 — p)b(x)maip + b(x)x — x) dx 
My 41/2 


(1.15) 


8-1 


ny 41/2 


= Of log — 
2/5+2e 


(1.16) Tf’  b(y41)p 


ny+1/2 
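From (1.4) and (1.8)-(1.16) we get the principal result of the paper that 

\[ f(z) = \sum_{n=1}^{\infty} b(n) \exp[iA(n)]\, z^n = G(z) + R(z), \]

where 

\[ G(z) = \pi^{1/2} e^{\pi i/4} \sum_{\nu=1}^{\infty} b(n_\nu)\, [d'(n_\nu)]^{-1/2} \rho^{n_\nu} \exp[iF(\nu; \theta)] \]

and, if $\alpha = \max(\eta, 2\epsilon)$, 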


R(z) = of [1 — + b(n.) 


6/5+€ n,/2 2/5+a 


+ (1 — p)b(m,)n, p” + (n,)n,  p 


+ >> [b(n)n*—1p" + (1 — = of 


n=1 i=1 


We notice that this procedure is valid only for |z| <1. We shall now give 
a functional equation of the type used by Wilton (5) which is valid for | z| =1. 
The equation is 


m 
(1.17) b(n) eA (™eind = b(n,) [d’(n,) + R(m, 8), 


n=1 v=1 


where n,+n,3/5<m Sn,4:+n3,. It can be seen that the sums 


ry+8 

> b(n) eiA(meind v<up, 

r+s 

— 3/5 
DL m M41 — M41; 
u+v 


where u=n,, v=n,°/5, can be handled by the methods which we have used 
and moreover 


= Of [d’(m) , 


where = 1: =734,. This shows that 




v=l 


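2. The order of G(z). If the series 

\[ \sum_{\nu=1}^{\infty} b(n_\nu)\, [d'(n_\nu)]^{-1/2} \]

converges uniformly in $\theta$, then the function G(z) tends to a continuous function 
of $\theta$ as $\rho = |z| \to 1$. We shall now suppose that the series does not converge 
and investigate the order of G(z) as $\rho \to 1$. We notice some properties of $n_\nu(\theta)$. 
First, since d(x) is monotone, $n_\nu(\theta) \uparrow \infty$ as $\nu \to \infty$ and $n_\nu(\theta)$ is monotone 
decreasing as a function of $\theta$ on $(0, 2\pi)$. Also 

\[ n_\nu(0) = n_{\nu+1}(2\pi). \]

Finally, since 

\[ d\{n_\nu(\theta)\} = 2\pi\nu - \theta, \]

we have 

\[ d'\{n_\nu(\theta)\}\, \frac{d}{d\nu} n_\nu(\theta) = 2\pi \]

and, since d'(x) = O(1/x), there must be a constant c>0 such that 

\[ (2.1) \qquad n_{\nu+1}(\theta) = n_\nu(\theta) + \int_\nu^{\nu+1} \frac{d}{d\mu} n_\mu(\theta)\,d\mu > n_\nu(\theta) + c\,n_\nu(\theta) = (1 + c)\,n_\nu(\theta). \]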
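The geometric growth in (2.1) can be illustrated for one concrete choice of d(x). The sketch below assumes $d(x) = (\log x)^{\beta}$ with $\beta = 1/2$ (the example treated in §5), for which $\log n_\nu(\theta) = (2\pi\nu - \theta)^{1/\beta}$; it works with logarithms because the $n_\nu$ themselves overflow quickly. The helper names are hypothetical.

```python
import math

def log_n(nu, theta, beta=0.5):
    """log n_nu(theta) for the assumed choice d(x) = (log x)^beta."""
    return (2 * math.pi * nu - theta) ** (1.0 / beta)

def log_ratio(nu, theta, beta=0.5):
    """log( n_{nu+1}(theta) / n_nu(theta) ); (2.1) says this exceeds log(1 + c)."""
    return log_n(nu + 1, theta, beta) - log_n(nu, theta, beta)

for nu in range(1, 6):
    for theta in (0.0, 1.0, 3.0, 6.0):
        # For beta <= 1 the log-ratio is at least 2*pi, so c = e^(2*pi) - 1 works.
        assert log_ratio(nu, theta) >= 2 * math.pi

# The identity n_nu(0) = n_{nu+1}(2*pi), checked on logarithms.
print(log_n(3, 0.0), log_n(4, 2 * math.pi))
```

For this particular d(x) the growth is far faster than geometric; (2.1) only needs the weaker statement.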


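We let $h(x) = b(x)\,[d'(x)]^{-1/2}$ and, if $n_\nu(0) \le x < n_{\nu+1}(0)$, we define 

\[ M(x) = \sum_{\mu=2}^{\nu} \max_{n_{\mu-1}(0) \le y \le n_\mu(0)} h(y) + \max_{n_\nu(0) \le y \le x} h(y). \]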
If A is greater than 1 and ,(0) <*<m,4:(0), ,-(0) S<Ax<n,-4:(0), then by 
(2.1) we have 
pw’ — < log Allog (1+ 
and by (C), h(y) =O{h(x)}, x<y<Ax, so that, if 1,(0) <*<m,4:(0), then 


(2.2) M(Ax) => max h(y) +04 max ny) = O{ M(x)}. 


y=2 My—,(0)<y<n,(0) n,(0)<y<x 


We write 


| G(z)| < h(m,)p™ = h(n,)err, y = log p, 


ve=l 




and define so that —1/y<m,4:(0). Then 


A(n,)er = > h(n,)ev + >> h(n,)ev 


v=1 


and, if 


= of > = M[n,(6)]}. 


v=1 v=1 


By (C) 


h(n,)ev = i(- > h(n,) | (- 
VS 


Since —n,y >(1+c)’-“(—n,y) >(1-+c)”-*-!, we may find an A and an m not 
depending on y so that 


nyy + log k log (— n,y) < — A(v — yp), for (v — pw) > m. 
This shows that 


> h(n,)ev = ofi(- > exp [my + log k log (— 


v=p+l1 p=v+1 


Hence by (2.2) 


(2.3) G(z) = 


We shall now use the additional hypothesis 
(D) M(x) = O{h(x)} 
and we propose to prove that if (D) is satisfied then 


The method is essentially that used by Hardy [1] for a similar problem. We 
take the series 


> ng h(n,)e¥"ve'F 


and let y= —a/n,. Then 


ng h(n,)ere® | > ng h(n,)e-* — ng h(n,)ev™ — nz h(n,)er. 


v==l 


By (D) we have h(n,) =O{h(n,)}, v<m, and hence there exists a \>0 and 
independent of a, v, and X, such that 


OG)” 


— 
nz h(n,) exp [— an,/n,] < (~) 
Ny 


v=l 


Ny n, 
= Ah(n,)nge-* >, exp | - a (= — 1 — log 
n 


v=l Ny 


Therefore 


v=1 


But, since 2,/n, for <p, there exists a p>0 such that 
Ny Ny 

—-—-1—log—> for r<~z, 
Ny 

and, for .—v >, u depending only on c, there exists an A so that 


Ny 


Ny 
——1—log— > Allog (1+ 0) ](u — »). 
Ny Ny 


Therefore 


(=) | - «(= -1)] exp [- at eg (1 +9) 


v=1 Ny 


+ > < exp [— log (1 + c)] + = 0(1) as a> om. 


We have also by (C) 


dX nz h(n,) exp (— an,/n,) 


be Ny a+logk Ny 
= of ne h(n,)e-* >> exp | - a (= 
Ny Ny 


Ny ny Ny 
= 04 nz hinder | - a — 1— log + log k log 
Ny n 


and, since n,/n,>(1+c) for vy >, there exists a p>0 for which, 


po 
) 




Ny Ny 
—-—1—lg—> ?, 
Ny Ny 
and hence for @ greater than a suitable ap, there is a po) >0, such that 
Ny Ny Ny 
-i- — log k log — > apo. 
Ny Ny Ny 
For v—y>w, w independent of a for a>ao, 
Ny Ny, Ny 
-1i- + log k log — > allog (1 + — 
n 


Ny Ny 


Hence 


Ny ny, Ny, 
> exp | - «(= — log k log =| 
Ny Ny Ny 


< + > exp [— av log (1+ ¢)] = 0(1) 
Consequently we can find an a so that for a>a, 
ne h(my) exp (— an,/m,) + h(n,) exp (— an,/n,) < h(n,)e~* 
v=1 


uniformly in @ and yp, and by (C), for y= —(a/n,), 


a 


1\¢ 1 
G(z) | > h(n,)e-* > o(- --) i( - -), w depends on a. 


p* y 


We can easily see that 


Therefore by a theorem of Hardy and Littlewood* we must have 


: 


Therefore under condition (D) we have an exact characterization of the order 
of G(z) as |z|—1. 
3. The order of R(z). By the above method we can easily show that 


* Hardy and Littlewood [3]. This result follows from Theorem 8 on setting ¢=y=h(—1/y). 


where, if 2,(0) <x <m,4:(0), 


Ki(x) = max b(y) + max b(y)y?/5te, 


y=2 my—,(0)<y<n,(0) n,(0) <9 <x 


where, if 2,(0) <x<m,4:(0), 


K.(x) = >> max b(y) + b(y) = O{ «*/5K,(x)}. 


(0)Sysn,(0) <y<x 


Similarly we can show that 


(3.3) =0 


y=1 


and 


(3.4) > b(n, ny/2 


v=1 
Since 2/5+a<4, we have 


of [d’(x) ]-1/2} 


and hence, if M(x)» asx-~, 
Ki(x) = o{ M(x)}. 


Therefore, if M(x)» 


1 
(3.5) R,(z) = j = 1, 2, 3,4. 
1—p 


If N< —1/y<N+1, y=log p, by (C) 


b(2iN) 


j=0 n=2iN+1 


> = of 


n=N+1 
and, since 


we have 




n=N+1 j=0 


of b(N) N21} 
Therefore 


{1/(i—p)] 
(3.6) = 04 


Similarly 


[1/(i—p)] {1/(1—p)] 


(3.7) R,(z) = — p) = of > 


n=1 


If the series 
b(m,) [d’(n,) 
v=1 


converges uniformly in 6, it is easily seen from the above estimates that the 
functions R;(z) tend to a continuous function of 6 as p—1, for i=1, 2, 3, 4. 
Moreover, since d’(x) =O(1/x) we must have b(x) =O(x-"/?) so that 


> b(n)n2— 
n=1 


converges and 
b(n) n*p” = of(1 
n=1 


Therefore in this case R(z) is continuous for |z| =1. 
We now wish to give some estimate for 


and compare it if possible with M(x). If b(~) is increasing, by the monotonic- 
ity of b(x) 


b(n)n2—! = Of b(N)N**} 


and, since by (A) 
b(N)N/? = O{ M(N)}, 


= (— j=5,6. 


we see that 




If (x) is not increasing the problem might have to be considered differ- 
ently in separate cases. However if we make the reasonable assumption that 


> b(n)n2—! = ety <h, 


then we see that in this case 


Ri(s) = of 


This assumption is always satisfied for functions of the logarithmic-exponen- 
tial type which is the type of function used in applications of this method. 

4. The modulus of continuity. If a function f(z) analytic for $|z| < 1$ is 
continuous for $|z| = 1$, we define the modulus of continuity of f(z) as the 
function 

\[ \omega(f, h) = \operatorname*{l.u.b.}_{|\theta_1 - \theta_2| < h} |f(e^{i\theta_1}) - f(e^{i\theta_2})|. \]

We wish to develop a method of finding the order of $\omega(f, h)$. Let us suppose 
that there is a function g(x) such that $g(x) \uparrow \infty$ as $x \uparrow 1$, then we shall show 
that if 

\[ f'(\rho e^{i\theta}) = O\{g(\rho)\}, \]

then 

\[ \omega(f, h) = O\Bigl\{ \int_{1-h}^{1} g(x)\,dx \Bigr\}, \]

provided the latter integral exists. The method used has been used by Hardy 
and Littlewood [4]. 
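We have 

\[ |f(e^{i\theta_1}) - f(e^{i\theta_2})| = \lim_{\rho \to 1} |f(\rho e^{i\theta_1}) - f(\rho e^{i\theta_2})|, \]

so that we need only show that 

\[ |f(\rho e^{i\theta_1}) - f(\rho e^{i\theta_2})| = O\Bigl\{ \int_{1-h}^{1} g(x)\,dx \Bigr\} \]

uniformly in $\rho < 1$ and $|\theta_1 - \theta_2| < h$. Since f'(z) is analytic for $|z| < 1$, 

\[ f(\rho e^{i\theta_1}) - f(\rho e^{i\theta_2}) = I_1 + I_2 + I_3, \]

where $I_1$ is the integral of f'(z) taken along the radius $\theta = \theta_2$ from $\rho$ to $\rho - h$, and $I_3$ 
is the similar integral along $\theta = \theta_1$ from $\rho - h$ to $\rho$, while $I_2$ is the integral along 
the arc $|z| = \rho - h$ from $\theta_2$ to $\theta_1$. Then 

\[ I_2 = O\{ h\, g(\rho - h) \} = O\{ h\, g(1 - h) \}. \]

Since $g(x) \uparrow \infty$ as $x \uparrow 1$, 

\[ I_1 = O\Bigl\{ \int_{\rho-h}^{\rho} g(x)\,dx \Bigr\} = O\Bigl\{ \int_{1-h}^{1} g(x)\,dx \Bigr\}. \]

Similarly 

\[ I_3 = O\Bigl\{ \int_{1-h}^{1} g(x)\,dx \Bigr\}. \]

Finally we notice that, since $g(x) \uparrow \infty$ as $x \uparrow 1$, 

\[ h\, g(1 - h) < \int_{1-h}^{1} g(x)\,dx. \]

Therefore 

\[ \omega(f, h) = O\Bigl\{ \int_{1-h}^{1} g(x)\,dx \Bigr\}. \]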
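The bound of this section is easy to evaluate for a concrete g. The sketch below is an illustration under assumptions not taken from the paper: it takes $g(x) = (1-x)^{-(1+\beta)}$ with $\beta = -1/2$, for which $\int_{1-h}^{1} g(x)\,dx = h^{-\beta}/(-\beta)$, and checks both this closed form and the inequality $h\,g(1-h) < \int_{1-h}^{1} g(x)\,dx$ with a simple midpoint rule. The helper `integral_bound` is hypothetical.

```python
import math

def integral_bound(g, h, n=20000):
    """Midpoint-rule value of the integral of g over [1 - h, 1].

    Midpoints are used so that the (possibly singular) endpoint x = 1
    is never evaluated.
    """
    step = h / n
    return sum(g(1 - h + (i + 0.5) * step) for i in range(n)) * step

beta = -0.5                               # assumed exponent, -1 < beta < 0
g = lambda x: (1 - x) ** (-(1 + beta))    # g(x) increases to infinity as x -> 1

for h in (0.1, 0.01):
    numeric = integral_bound(g, h)
    exact = h ** (-beta) / (-beta)        # closed form of the integral
    print(h, numeric, exact, h * g(1 - h))
```

For this g the method gives $\omega(f, h) = O(h^{-\beta})$, the same order as the first case of (3) in the introduction.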


5. Applications. To illustrate the scope of this method we shall apply it 
in detail to a particular function which has already been considered and point 
out some extensions of the usual results which may be obtained by this 
method. The function which we shall consider is 

\[ f_{\beta\gamma}(z) = \sum_{n=2}^{\infty} n^{-1/2} (\log n)^{-\gamma} \exp\Bigl[ i \int_1^n (\log x)^{\beta}\,dx \Bigr] z^n. \]

This is similar to the function 

\[ \sum_{n=2}^{\infty} n^{-1/2} (\log n)^{-\gamma} \exp[ in (\log n)^{\beta} ]\, z^n \]

considered by Ingham [6]. Ingham's function could also be handled by these 
methods but the details would be slightly more complicated. 




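For the function under consideration 

\[ d(x) = (\log x)^{\beta}, \qquad b(x) = x^{-1/2} (\log x)^{-\gamma} \]

and 

\[ (\log n_\nu)^{\beta} = 2\pi\nu - \theta, \qquad n_\nu(\theta) = \exp[ (2\pi\nu - \theta)^{1/\beta} ]. \]

Therefore 

(A) $d'(x) = \beta x^{-1} (\log x)^{\beta-1} = O(x^{-1})$, since $\beta < 1$; 
$1/d'(x) = \beta^{-1} x (\log x)^{1-\beta} = O(x^{1+\epsilon})$, for every $\epsilon$; 
$d''(x) = O(x^{-2})$; 

(B) $b'(x) = -\tfrac12 x^{-3/2} (\log x)^{-\gamma} - \gamma x^{-3/2} (\log x)^{-\gamma-1} = O\{ x^{-3/2} (\log x)^{-\gamma} \} = O\{ b(x) x^{-1} \}$; 

(C) $e^2 d'(x') > d'(x)$, $x \le x' \le ex$; 
$e^2 b(x') > b(x)$, $x \le x' \le ex$. 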
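The closed form for $n_\nu(\theta)$ and the first estimate in (A) can be verified directly. The sketch below is only a numerical check with the assumed value $\beta = 1/2$; the function names `d`, `d_prime` and `n_nu` are hypothetical labels for the quantities in the text.

```python
import math

def d(x, beta):
    """d(x) = (log x)^beta, the phase derivative used in this example."""
    return math.log(x) ** beta

def d_prime(x, beta):
    """d'(x) = beta * x^(-1) * (log x)^(beta - 1)."""
    return beta * math.log(x) ** (beta - 1) / x

def n_nu(nu, theta, beta):
    """n_nu(theta) = exp[(2*pi*nu - theta)^(1/beta)], the solution of d(n) = 2*pi*nu - theta."""
    return math.exp((2 * math.pi * nu - theta) ** (1.0 / beta))

beta = 0.5
n = n_nu(2, 1.0, beta)
# d evaluated at n_nu(theta) recovers 2*pi*nu - theta.
print(d(n, beta), 4 * math.pi - 1.0)

# x * d'(x) stays at or below beta for x >= e, which is the O(1/x) estimate in (A).
for x in (math.e, 10.0, 1e3, 1e6):
    print(x, x * d_prime(x, beta))
```

Only small $\nu$ are usable in floating point here, since $n_\nu$ grows like the exponential of a power of $\nu$.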


Since $\eta = \delta = 0$ and $\epsilon$ can be made less than 1/30, we have 


7/10 ny 


= O4(1 = 6) 


+(1—p) 


n=1 
—7/10 


=O{(1—p)(1—p) +1}, 


so that R(z) is continuous when $|z| = 1$. The function G(z) is continuous for 
$|z| = 1$ if 


—7/B—1/2+1/ (28) 


(5.1) > n, n,) (log = — 8) 
v=2 


v=2 


converges, and the necessary and sufficient condition that (5.1) converge is 


1 1 1 
i, 1). 
This corresponds to the result obtained by Ingham. 
We shall now consider the modulus of continuity of $f_{\beta\gamma}(z)$. We notice that 
the function $b(x) = x^{1/2} (\log x)^{-\gamma}$ satisfies (B) and (C) so that the discussion 
of §§2, 3 applies to the function 


fay(2) = > n'!2(log exp doe aypas\ s*. 


n=1 




Since 


M(x) = n,_s(log 4 a(log x)—1-8/2+1/2 


1 —y—B/2+1/2 
But 
1 1 —y—B/2+1/2 h 1 —y—8/2+1/2 
f (1 — ( )] dp = f (toe -) dx 
a 1—p 0 x 


— 1 1 \—7-8/2+3/2 
= a—"(log = (log -) 
y+ B/2 — 3/2 


so that 


and hence 


1 \—7-8/2+3/2 
04 (tos -) 


This result could not be obtained by the other methods to which we have 
referred. 
6. Further applications. If 

\[ \int_1^{\infty} b^2(x)\,dx = \infty, \]

we know that the function 

\[ f(z) = \sum_{n=1}^{\infty} b(n) \exp[iA(n)]\, z^n \]

cannot tend to a continuous or even a bounded function as $\rho = |z| \to 1$. We 
wish now to study the behavior of $f(\rho e^{i\theta})$ for a particular choice of d(x). 
We assume that b(x) satisfies (B) with $\delta < 1/10 - \epsilon$, and (C) and, 


(E) f = e< 


30 
d(x) = tog] f 


We define 




Then by hypothesis $d(x) \uparrow \infty$, as $x \to \infty$, and 


d'(x) = ~ 


1 
(x) 10 € 


= 00-4), 


1 
d’(x) 
so that the function d(x) so defined satisfies (A). It is also clear that since 
b(x) satisfies (C), d(x) will also satisfy (C). We have 


1 
= O(x'**), <— 
(x!**) 


b(x) [d’(x) = (f © as 


and 
b(n,) [d’(n,) = 
Therefore 


M(x) = b(n,(0)) [d’(,(0)) + b(x) [d’(x) 


ny (0)Sz 


= e+ d(x) [d’(x)]-/? = b(x) 


ny(0)Sz 


and condition (D) is satisfied. By the argument of §3 


1 [1/(1—p)] 
1 

and 

N N 1/2 N 1/2 

d(n)n2 < | | | 2. niet] 

1 1 1 

= 0|M(N)}, 

so that 


R(z) 


(; : 


Therefore by the argument of §2, 


f(z) = of (—)} of( 


This answers the question mentioned in the introduction about the possibility 
of choosing d(x) so that f(z) have the same order as $|z| \to 1$ for every $\theta$. 

We might apply these considerations to the case where $b(x) = x^{-1/2}$. We 
can readily obtain the result for this case that if 


and 


d(x) = log if = log log x 
then 
and 


We might compare this with the results for the function 


fol(z) = exp [in log n]z” 


n=2 


obtained by Hardy and Littlewood which are mentioned in the introduction 


((1) and (2)). 
REFERENCES 


1. Hardy, these Transactions, vol. 17 (1916), pp. 301-325. 

2. Hardy and Littlewood, Proceedings of the National Academy of Sciences, vol. 2 (1916), pp. 
583-586. 

3. Hardy and Littlewood, Proceedings of the London Mathematical Society, (2), vol. 11 (1912), 
pp. 411-478. 

4. Hardy and Littlewood, Mathematische Zeitschrift, vol. 28 (1928), pp. 612-634. 

5. Hille, Journal of the London Mathematical Society, vol. 4 (1929), pp. 176-182. 

6. Ingham, Annals of Mathematics, (2), vol. 31 (1930), pp. 241-250. 

7. Paley, Proceedings of the London Mathematical Society, (2), vol. 31 (1930), pp. 301-328. 

8. Wilton, Journal of the London Mathematical Society, vol. 9 (1934), pp. 194-201, 247-256. 


NORTHWESTERN UNIVERSITY, 
EVANSTON, ILL. 


ON THE SOLUTIONS OF QUASI-LINEAR ELLIPTIC 
PARTIAL DIFFERENTIAL EQUATIONS* 
BY 
CHARLES B. MORREY, JR. 
In this paper, we are concerned with the existence and differentiability 
properties of the solutions of “quasi-linear” elliptic partial differential equa- 
tions in two variables, i.e., equations of the form 


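\[ A(x, y, z, p, q)\,r + 2B(x, y, z, p, q)\,s + C(x, y, z, p, q)\,t = D(x, y, z, p, q), \]

\[ AC - B^2 > 0,\ A > 0 \qquad \Bigl( p = \frac{\partial z}{\partial x},\ q = \frac{\partial z}{\partial y},\ r = \frac{\partial^2 z}{\partial x^2},\ s = \frac{\partial^2 z}{\partial x\,\partial y},\ t = \frac{\partial^2 z}{\partial y^2} \Bigr). \]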


These equations are special cases of the general elliptic equation 
$(x, 2, 9,7, 5; t) = 0, o@: — > 0, > 0. 
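As a concrete instance of the quasi-linear ellipticity condition $AC - B^2 > 0$, $A > 0$, the sketch below checks it pointwise. The minimal surface equation $(1+q^2)r - 2pq\,s + (1+p^2)t = 0$ is used purely as a familiar illustration and is not taken from this paper; the function names are hypothetical.

```python
def is_elliptic(A, B, C):
    """Pointwise ellipticity test for A r + 2B s + C t = D: A > 0 and AC - B^2 > 0."""
    return A > 0 and A * C - B * B > 0

def minimal_surface_coefficients(p, q):
    """Coefficients (A, B, C) of the minimal surface equation (illustrative example)."""
    return 1 + q * q, -p * q, 1 + p * p

A, B, C = minimal_surface_coefficients(2.0, -3.0)
# Here AC - B^2 = 1 + p^2 + q^2, which is positive for every p and q.
print(A * C - B * B, is_elliptic(A, B, C))
# A hyperbolic choice of coefficients fails the test.
print(is_elliptic(1.0, 2.0, 1.0))
```

Because the coefficients depend on p and q, the test must hold for every value of the gradient that can occur, which is why a priori bounds on |p| and |q| matter.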


The literature concerning these equations being very extensive, we shall 
not attempt to give a complete list of references. The starting point for many 
more modern researches has been the work of S. Bernstein,† who was the 
first to prove the analyticity of the solutions of the general equation with $\phi$ 
analytic and who was able to obtain a priori bounds for the second and higher 
derivatives of z in the quasi-linear type in terms of the bounds of |z|, |p|, |q| 
and the derivatives of the coefficients. He was also able to prove the existence 
of the solution of the quasi-linear equation in some very general cases. He 
assumed that all the data were analytic. However, his papers are very complicated 
and certain details require modification. On account of the results of 
J. Horn, L. Lichtenstein, and many others,‡ the restriction of analyticity has 
been removed. Some very interesting modern work has been done by Leray 
and Schauder§ in a paper in which they develop a general theory of non-linear 
functional equations and apply their results to quasi-linear equations. 

* Presented to the Society, April 11, 1936; received by the editors February 9, 1937. 

† Particularly the papers: Sur la nature analytique des solutions des équations aux dérivées 
partielles du second ordre, Mathematische Annalen, vol. 59 (1904), pp. 20-76 and Sur la généralisation 
du problème de Dirichlet, Mathematische Annalen, vol. 69 (1910), pp. 82-136. 

‡ For an account of some of these results, see the article by L. Lichtenstein on the theory of 
elliptic partial differential equations in the Encyklopädie der Mathematischen Wissenschaften, vol. 
II 32, pp. 1280-1334. 

§ Jean Leray and Jules Schauder, Topologie et équations fonctionnelles, Annales Scientifiques de 
l'École Normale Supérieure, vol. 51 (1934), pp. 45-78. 




Schauder* has also obtained good a priori bounds for the solutions (and their 
derivatives) of linear elliptic equations in any number of variables. 

In the present paper, an elliptic pair of linear partial differential equations 
of the form 


\[ (1) \qquad v_x = -(b_2 u_x + c\,u_y + e), \qquad v_y = a\,u_x + b_1 u_y + d, \qquad 4ac - (b_1 + b_2)^2 \ge m > 0, \]


is studied. We assume merely that the coefficients are uniformly bounded 
and measurable. In such a general case, of course, the functions u and v do 
not possess continuous derivatives but are absolutely continuous in the sense 
of Tonelli with their derivatives summable with their squares (over interior 
closed sets). However, certain uniqueness, existence, and compactness theorems 
are demonstrated and the functions u and v are seen also to satisfy 
Hölder conditions. These results are immediately used to show that if z(x, y) 
is a function which minimizes 

\[ \iint_R f(x, y, z, p, q)\,dx\,dy, \qquad (f_{pp} f_{qq} - f_{pq}^2 > 0,\ f_{pp} > 0) \]

among all functions (for which the integral may be defined) which take on 
the same boundary values as z(x, y) and if z(x, y) satisfies a Lipschitz condition 
on R, its first partial derivatives satisfy Hölder conditions. E. Hopf† has 
already shown that if p and q satisfy Hölder conditions, then the second 
derivatives satisfy Hölder conditions. In proving this fact, Hopf shows that 
if the coefficients in (1) satisfy Hölder conditions, the first partial derivatives 
of u and v satisfy Hölder conditions. The results of this paper concerning the 
system (1) together with Hopf's result yield very simple proofs of the existence 
of a solution of the quasi-linear equation in certain cases, a few of which 
are presented in §6. 

The developments of this paper are entirely straightforward, being, for 
the most part, generalizations of known elementary results analogous to the 
step from Riemann to Lebesgue integration. The main tools by means of 
which the Hölder conditions of u and v in (1) are demonstrated are Theorems 
1 and 2 of §2 which state roughly that: if $\xi = \xi(x, y)$, $\eta = \eta(x, y)$ is a 1-1 differentiable 
transformation of a Jordan region R into another Jordan region $\Sigma$ 
in which the ratio of the maximum to the minimum magnification is uniformly 
bounded, then the functions $\xi$ and $\eta$ and the functions $x(\xi, \eta)$ and 
$y(\xi, \eta)$ of the inverse satisfy Hölder conditions. Since these Hölder conditions 


* J. Schauder, Über lineare elliptische Differentialgleichungen zweiter Ordnung, Mathematische 
Zeitschrift, vol. 38 (1933-34), pp. 257-282. 

† E. Hopf, Zum analytischen Charakter der Lösungen regulärer zweidimensionaler Variationsprobleme, 
Mathematische Zeitschrift, vol. 30, pp. 404-413. 


are so important in the modern theory of elliptic equations, these two theorems 
may prove to be an important tool in this field. 

We use the following notation: A function $\phi(x, y)$ is said to be of class 
$C^{(n)}$ if it is continuous together with its partial derivatives of the first n 
orders. If E is a point set, $\bar{E}$ denotes its closure and $E^*$ its boundary points. 
If E and F are point sets, the symbol $E \cdot F$ denotes their product, $E + F$ denotes 
their sum (E and F need not be mutually exclusive), and $E \subset F$ means 
that E is a subset of F. The symbol C(P, r) denotes the open circular disc 
with center at P and radius r. 

1. Preliminary definitions and lemmas. Most of the definitions and lem- 
mas of this section are either found in the literature or are easily deducible 
from known results. We include the material of this section for completeness. 


DEFINITION 1. A function u(x, y) is said to be strictly absolutely continuous 
in the sense of Tonelli† (A.C.T.) on a closed rectangle (a, c; b, d) if it is continuous 
there and 

(i) for almost all X, $a \le X \le b$, u(X, y) is absolutely continuous in y on 
the interval (c, d), and for almost all Y, $c \le Y \le d$, u(x, Y) is absolutely continuous 
in x on the interval (a, b) and 

(ii) $V_c^d[u(X, y)]$ and $V_a^b[u(x, Y)]$ are summable functions of X and 
Y, respectively, $V_c^d[u(X, y)]$, for instance, denoting the variation on (c, d) 
of u(X, y) considered as a function of y alone. It is clear that these variations 
are lower semicontinuous in the large lettered variables. 

DEFINITION 2. A function u(x, y) is said to be A.C.T. on a region R (or $\bar{R}$) 
if it is continuous there and strictly A.C.T. on each closed interior rectangle. 

Remark. Evans‡ has shown that every continuous “potential function of 
its generalized derivatives”§ is A.C.T. and conversely, so that his theorems 
concerning the former functions are applicable to the latter. Thus 

Lemma 1.§ If u(x, y) is A.C.T. on R, then $\partial u/\partial x = u_x$ and $\partial u/\partial y = u_y$ exist 
almost everywhere in R and are summable over every closed subregion of R. 

Lemma 2.§ If x = x(s, t) and y = y(s, t) is a 1-1 continuous transformation 
of class C' of a region R of the (x, y)-plane into a region $\Sigma$ of the (s, t)-plane, 
and if u(x, y) is A.C.T. on R, then the function u[x(s, t), y(s, t)] is A.C.T. on 
$\Sigma$ and 

\[ u_s = u_x x_s + u_y y_s, \qquad u_t = u_x x_t + u_y y_t \]

almost everywhere. 


† L. Tonelli, Sulla quadratura delle superficie, Atti della Reale Accademia dei Lincei, (6), vol. 3 
(1926), pp. 357-362, 445-450, 633-638, 714-719. 

‡ G. C. Evans, Complements of potential theory, II, American Journal of Mathematics, vol. 55 
(1933), pp. 29-49. 

§ G. C. Evans, Fundamental points of potential theory, Rice Institute Pamphlets, vol. 7, No. 4 
(1920), pp. 252-329. 
almost everywhere. 


DEFINITION 3. Let D be a region. By the expression almost all rectangles 
of D we mean the totality of rectangles $a \le x \le b$, $c \le y \le d$ in D for which a 
and b do not belong to some set of measure zero of values of x, and c and d 
do not belong to some set of measure zero of values of y. Naturally either or 
both sets of measure zero may be vacuous. 


DEFINITION 4. Let $\phi(D)$ be a function defined on almost all rectangles D 
of a region R such that whenever $D = D_1 + D_2$, where $D_1$ and $D_2$ are admissible 
rectangles having only an edge in common, we have that $\phi(D) = \phi(D_1) + \phi(D_2)$. 
We say that $\phi(D)$ is absolutely continuous on R if for every $\epsilon > 0$, 
there exists a $\delta > 0$ such that for every sequence of non-overlapping admissible 
rectangles $\{D_n\}$ with $\sum \operatorname{meas}(D_n) < \delta$, we have $\sum |\phi(D_n)| < \epsilon$. 


Lemma 3.† Let f(x, y) be summable on R, let D denote a rectangle (a, c; b, d) 
on which f(x, y) is summable (this being almost all rectangles of R) and define 

\[ \phi(D) = \int_c^d [f(b, y) - f(a, y)]\,dy = \int_{D^*} f\,dy, \]

\[ \psi(D) = \int_a^b [f(x, d) - f(x, c)]\,dx = -\int_{D^*} f\,dx. \]

Then a necessary and sufficient condition that f(x, y) be A.C.T. on R is that 
f(x, y) be continuous and $\phi(D)$ and $\psi(D)$ be absolutely continuous on each subregion 
$\Delta$ for which $\bar{\Delta} \subset R$. When this is true, 

\[ \phi(D) = \iint_D f_x\,dx\,dy, \qquad \psi(D) = \iint_D f_y\,dx\,dy \]

for each rectangle in R. 


DEFINITION 5. We say that a function $\phi(x, y)$ is of class $L_p$ on a region 
R if $|\phi|^p$ is summable over R. 


DEFINITION 6. We say that a function u(x, y) is of class $D_p$ on R or $\bar{R}$ if 
it is A.C.T. there and $|u_x|^p$ and $|u_y|^p$ are summable over every closed subregion 
of R. 


LEMMA 4. Let $\{\phi_n(x, y)\}$, $n = 1, 2, \dots$, and $\phi(x, y)$ be of class $L_p$ on a
rectangle $(a, c; b, d)$ with

† A summable function $f(x, y)$ for which $\phi(D)$ and $\psi(D)$ are absolutely continuous on each
subregion $\Delta$ of $R$ for which $\bar\Delta \subset R$ is said to be a "potential function of its generalized
derivatives" and this lemma is essentially the theorem of Evans mentioned above (Complements of
potential theory, loc. cit.).


130 C. B. MORREY [January 


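$$\int_a^b\!\int_c^d |\phi_n|^p\,dx\,dy \le G,$$

where $G$ is independent of $n$. Let

$$\Phi_n(x, y) = \int_a^x\!\int_c^y \phi_n(\xi, \eta)\,d\xi\,d\eta, \qquad \Phi(x, y) = \int_a^x\!\int_c^y \phi(\xi, \eta)\,d\xi\,d\eta,$$

and suppose that the sequence $\{\Phi_n(x, y)\}$ converges uniformly to $\Phi(x, y)$. Suppose
also that $\{A_n(x, y)\}$, $n = 1, 2, \dots$, and $A(x, y)$ are of class $L_q$ on $(a, c; b, d)$,
$q = p/(p-1)$, and that

$$\lim_{n\to\infty} \int_a^b\!\int_c^d |A_n - A|^q\,dx\,dy = 0.$$

Then

$$\text{(i)}\quad \int_a^b\!\int_c^d |\phi|^p\,dx\,dy \le \liminf_{n\to\infty} \int_a^b\!\int_c^d |\phi_n|^p\,dx\,dy,$$

$$\text{(ii)}\quad \lim_{n\to\infty} \int_a^b\!\int_c^d A_n\phi_n\,dx\,dy = \int_a^b\!\int_c^d A\phi\,dx\,dy.$$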


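Proof. The first conclusion is well known.†
To prove (ii), choose $\epsilon > 0$. For $n > N_1$, we see from the Hölder inequality
that

$$\left|\int_a^b\!\int_c^d (A_n - A)\phi_n\,dx\,dy\right| \le \left[\int_a^b\!\int_c^d |A_n - A|^q\,dx\,dy\right]^{1/q} \left[\int_a^b\!\int_c^d |\phi_n|^p\,dx\,dy\right]^{1/p} < \frac{\epsilon}{4}.$$

Now, let $\{B_k(x, y)\}$ be a sequence of step functions‡ such that

$$\lim_{k\to\infty} \int_a^b\!\int_c^d |B_k - A|^q\,dx\,dy = 0.$$

Then, for $k > K$ (independent of $n$),

$$\left|\int_a^b\!\int_c^d (B_k - A)\phi_n\,dx\,dy\right| \le \left[\int_a^b\!\int_c^d |B_k - A|^q\,dx\,dy\right]^{1/q} \left[\int_a^b\!\int_c^d |\phi_n|^p\,dx\,dy\right]^{1/p} < \frac{\epsilon}{4},$$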

† For instance, this result may be obtained by the method of proof used in Theorem 7, §1 of
the author's paper, A class of representations of manifolds, I, American Journal of Mathematics, vol. 55
(1933), p. 693.

‡ To form these, let $G_k$ denote the grating formed by all lines of the form $x = 2^{-k}i$, $y = 2^{-k}j$. We
then define $B_k$ to be a properly chosen constant on the part of each square of $G_k$ which contains a
point of the rectangle.


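and likewise

$$\left|\int_a^b\!\int_c^d (B_k - A)\phi\,dx\,dy\right| \le \left[\int_a^b\!\int_c^d |B_k - A|^q\,dx\,dy\right]^{1/q} \left[\int_a^b\!\int_c^d |\phi|^p\,dx\,dy\right]^{1/p} < \frac{\epsilon}{4}.$$

Now, let $k_0 > K$. Then, for $n > N_2$,

$$\left|\int_a^b\!\int_c^d B_{k_0}(\phi_n - \phi)\,dx\,dy\right| = \left|\sum_i B_{k_0,i} \iint_{R_i} (\phi_n - \phi)\,dx\,dy\right| < \frac{\epsilon}{4},$$

the $R_i$ being the rectangular subregions of $R$ over each of which $B_{k_0}$ is constant
and $B_{k_0,i}$ being the value of $B_{k_0}$ on $R_i$.
Finally, if we let $N$ be the larger of $N_1$ and $N_2$, we see that, for $n > N$,

$$\left|\iint A_n\phi_n - \iint A\phi\right| \le \left|\iint (A_n - A)\phi_n\right| + \left|\iint (A - B_{k_0})\phi_n\right| + \left|\iint B_{k_0}(\phi_n - \phi)\right| + \left|\iint (B_{k_0} - A)\phi\right| < \epsilon,$$

all integrals being taken over the rectangle $(a, c; b, d)$.
This proves the lemma.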


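LEMMA 5. Let $\phi(x, y)$ be of class $L_p$ on $R$, $p \ge 1$, and let $R_h$ be that subset of
points $(x_0, y_0)$ of $R$ such that all points $(x, y)$ with $|x - x_0| \le h$, $|y - y_0| \le h$ are
in $R$. Let

$$\phi_h(x, y) = \frac{1}{4h^2} \int_{x-h}^{x+h}\!\int_{y-h}^{y+h} \phi(\xi, \eta)\,d\xi\,d\eta, \qquad h > 0,$$

be defined in $R_h$. Then $\phi_h(x, y)$ is continuous and of class $L_p$ on $R_h$ and

$$\text{(i)}\quad \iint_{R_h} |\phi_h|^p\,dx\,dy \le \iint_R |\phi|^p\,dx\,dy, \qquad \lim_{h\to 0} \iint_{R_h} |\phi_h|^p\,dx\,dy = \iint_R |\phi|^p\,dx\,dy,$$

$$\text{(ii)}\quad \lim_{h\to 0} \iint_{R_{h_0}} |\phi_h - \phi|^p\,dx\,dy = 0, \qquad h_0 > 0.$$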




Proof. This lemma is well known.†


DEFINITION 7. Let $u(x, y)$ be defined on $R$ or $\bar R$. If $u$ is of class $D_2$ on $R$
or $\bar R$ with $|u_x|^2$ and $|u_y|^2$ summable over the whole of $R$, we define

$$D(u) = \iint_R (u_x^2 + u_y^2)\,dx\,dy.$$

Otherwise we define $D(u) = +\infty$.


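LEMMA 6. Let $\{u_n(x, y)\}$ and $\{v_n(x, y)\}$ each be of class $D_2$ on $R$ with
$D(u_n)$ and $D(v_n) \le G$ independent of $n$, and suppose that $\{u_n(x, y)\}$ and
$\{v_n(x, y)\}$ converge uniformly on $R$ to functions $u(x, y)$ and $v(x, y)$, respectively.
Let the sequences $\{a_n(x, y)\}$, $\{b_n\}$, $\{c_n\}$, $\{d_n\}$, $\{e_n\}$, $\{f_n\}$, $\{g_n\}$, $\{h_n\}$, $\{k_n\}$,
and $\{l_n\}$ be measurable and uniformly bounded and suppose the sequences converge
almost everywhere on $R$ to $a$, $b$, $c$, $d$, $e$, $f$, $g$, $h$, $k$, and $l$ respectively. Then

(i) $u(x, y)$ and $v(x, y)$ are of class $D_2$ on $R$,

(ii) $D(u) \le \liminf_{n\to\infty} D(u_n)$, $D(v) \le \liminf_{n\to\infty} D(v_n)$,

(iii) $$\iint_R [(a u_x + b u_y + c v_x + d v_y + e)^2 + (f u_x + g u_y + h v_x + k v_y + l)^2]\,dx\,dy$$
$$\le \liminf_{n\to\infty} \iint_R [(a_n u_{nx} + b_n u_{ny} + c_n v_{nx} + d_n v_{ny} + e_n)^2 + (f_n u_{nx} + g_n u_{ny} + h_n v_{nx} + k_n v_{ny} + l_n)^2]\,dx\,dy.$$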


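Proof. Conclusions (i) and (ii) are well known.†
To prove (iii), let $M$ be the uniform bound for $a_n$, etc., and let

$$\phi_n = a_n u_{nx} + b_n u_{ny} + c_n v_{nx} + d_n v_{ny} + e_n, \qquad \phi = a u_x + b u_y + c v_x + d v_y + e,$$
$$\psi_n = f_n u_{nx} + g_n u_{ny} + h_n v_{nx} + k_n v_{ny} + l_n, \qquad \psi = f u_x + g u_y + h v_x + k v_y + l.$$

Then, for each $h > 0$, we see that $\phi_n^{(h)}$, $\psi_n^{(h)}$, $\phi^{(h)}$, and $\psi^{(h)}$ are uniformly
bounded on $R_h$, the proof for $\phi_n^{(h)}$, for instance, employing the Hölder inequality
as follows:

$$|\phi_n^{(h)}| = \frac{1}{4h^2} \left| \int_{x-h}^{x+h}\!\int_{y-h}^{y+h} a_n(\xi, \eta)\,u_{n\xi}(\xi, \eta)\,d\xi\,d\eta + * + * + * + \int_{x-h}^{x+h}\!\int_{y-h}^{y+h} e_n\,d\xi\,d\eta \right|$$
$$\le \frac{1}{4h^2} \left\{ \left[\int_{x-h}^{x+h}\!\int_{y-h}^{y+h} a_n^2\,d\xi\,d\eta\right]^{1/2} \left[\int_{x-h}^{x+h}\!\int_{y-h}^{y+h} u_{n\xi}^2\,d\xi\,d\eta\right]^{1/2} + * + * + * \right\} + M \le \frac{2MG^{1/2}}{h} + M,$$

each $*$ denoting a term similar to the first term. By Lemma 4, $\phi_n^{(h)}$ and $\psi_n^{(h)}$
converge at each point to $\phi^{(h)}$ and $\psi^{(h)}$ respectively. Hence,

$$\iint_{R_h} [(\phi^{(h)})^2 + (\psi^{(h)})^2]\,dx\,dy = \lim_{n\to\infty} \iint_{R_h} [(\phi_n^{(h)})^2 + (\psi_n^{(h)})^2]\,dx\,dy \le \liminf_{n\to\infty} \iint_R (\phi_n^2 + \psi_n^2)\,dx\,dy,$$

using Lemma 5. The statement (iii) follows from Lemma 5 by letting $h \to 0$.

† See Lemma 1, §1 of the author's paper, loc. cit.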


LEMMA 7. Let $f$ and $\phi$ be of class $D_2$ on $R$, let $D$ be a Jordan subregion of $R$
such that $D^*$ is a rectifiable curve interior to $R$ on which $\phi$ is of bounded variation.
Then $\partial(f, \phi)/\partial(x, y)$ is summable on $D$ and

$$\iint_D (f_x\phi_y - f_y\phi_x)\,dx\,dy = \int_{D^*} f\,d\phi,$$

the line integral being the ordinary Stieltjes integral.

Proof. That the Jacobian is summable follows from the Schwarz inequality,
since $f_x^2 + f_y^2 + \phi_x^2 + \phi_y^2$ is summable over $D$.

Let $f_h$ and $\phi_k$ have their usual significance as average functions. If $|h|$,
$|k| < a$, $f_h$ and $\phi_k$ are defined on $D$ and moreover $f_h$ is of class $C'$ if $h > 0$ and
$\phi_k$ is of class $C'$ if $k > 0$. Hence, by Green's theorem

$$\int_{D^*} f_h\,d\phi_k = \iint_D (f_{hx}\phi_{ky} - f_{hy}\phi_{kx})\,dx\,dy = -\int_{D^*} \phi_k\,df_h.$$

Letting $k$ tend to zero, and using Lemmas 4 and 5 and well known theorems
on the Stieltjes integral, we see that

$$\int_{D^*} f_h\,d\phi = -\int_{D^*} \phi\,df_h = \iint_D (f_{hx}\phi_y - f_{hy}\phi_x)\,dx\,dy.$$

The result follows by letting $h$ tend to zero.
2. Fundamental theorems on transformations. We state first 


DEFINITION 1. We say that $u(x, y)$ satisfies a condition $A[\lambda; M(a, d)]$
on $R$ if it is of class $D_2$ on $R$ and

$$\iint_{C(P, r)} (u_x^2 + u_y^2)\,dx\,dy \le M(a, d)\left(\frac{r}{a}\right)^{\lambda}, \qquad 0 \le r \le a, \quad P = (x, y) \in R, \quad \lambda > 0,$$

where $a > 0$, $d > 0$, and $a + d$ is the distance of $(x, y)$ from $R^*$, $M(a, d)$ depending
on $a$ and $d$ and not on $(x, y)$.
DEFINITION 2. We say that $u(x, y)$ satisfies a condition $B[\mu; N(a, d)]$ on
$R$ if

$$|u(x_1, y_1) - u(x_2, y_2)| \le N(a, d)\left(\frac{r}{a}\right)^{\mu}, \qquad 0 \le r \le a, \quad r = [(x_2 - x_1)^2 + (y_2 - y_1)^2]^{1/2},$$

provided that every point on the segment joining $(x_1, y_1)$ to $(x_2, y_2)$ is at a
distance $\ge a/2 + d$ from $R^*$.

LEMMA 1. Let $u(x, y)$ satisfy a condition $A[\lambda; M(a, d)]$ on $R$. Then it also
satisfies a condition $B[\lambda/2; N(a, d)]$, where
$$N(a, d) = 8 \cdot 3^{-1/2} \lambda^{-1} [M(a/2, d)]^{1/2}.$$

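Proof. First assume that $u(x, y)$ is of class $C'$ on $R$. Let $P_1\colon (x_1, y_1)$ and
$P_2\colon (x_2, y_2)$ be two points of $R$ which are such that every point on the segment
joining them is at a distance $\ge a/2 + d$ from $R^*$. Next, choose axes so that $P_1$
is the origin, and $P_2$ is the point $(2^{-1/2} r, 2^{-1/2} r)$. Let $\alpha = 2^{-1/2} r$. Then each
square of the form

$$0 \le x \le \alpha t,\ 0 \le y \le \alpha t \qquad \text{or} \qquad \alpha - \alpha t \le x \le \alpha,\ \alpha - \alpha t \le y \le \alpha \qquad (0 \le t \le 1)$$

is in a circle of radius $rt/2$ whose center is at a distance $\ge a/2 + d$ from $R^*$, so that

$$\int_0^{\alpha t}\!\int_0^{\alpha t} (u_x^2 + u_y^2)\,dx\,dy \le M(a/2, d)(rt/a)^{\lambda}, \qquad \int_{\alpha - \alpha t}^{\alpha}\!\int_{\alpha - \alpha t}^{\alpha} (u_x^2 + u_y^2)\,dx\,dy \le M(a/2, d)(rt/a)^{\lambda}.$$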


Now, for each $(x, y)$ with $0 \le x \le \alpha$, $0 \le y \le \alpha$, we have

$$u(\alpha, \alpha) - u(0, 0) = x \int_0^1 u_x(xt, yt)\,dt + y \int_0^1 u_y(xt, yt)\,dt$$
$$\qquad - (x - \alpha) \int_0^1 u_x[\alpha + t(x - \alpha), \alpha + t(y - \alpha)]\,dt - (y - \alpha) \int_0^1 u_y[\alpha + t(x - \alpha), \alpha + t(y - \alpha)]\,dt;$$

integrating both sides of this equation with respect to $x$ and $y$, we obtain


SS {fe- a) |ath dxdy 


J J x — a)ug(E, — 


the last being obtained by suitable changes of variable; the * denotes the 
term in y or 7 which is similar to the term preceding it. Using Schwarz’s in- 
equality on the interior terms, we obtain 


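$$|u(\alpha, \alpha) - u(0, 0)| \le 4 \cdot 3^{-1/2} [M(a/2, d)]^{1/2} (r/a)^{\lambda/2} \int_0^1 t^{\lambda/2 - 1}\,dt = 8 \lambda^{-1} 3^{-1/2} [M(a/2, d)]^{1/2} (r/a)^{\lambda/2}.$$

If $u(x, y)$ is merely of class $D_2$, $u_h(x, y)$ is of class $C'$ and we obtain the
general result by letting $h$ tend to zero.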


DEFINITION 3. Let $T\colon \xi = \xi(x, y)$, $\eta = \eta(x, y)$ be a 1-1 continuous transformation
of a closed region $\bar R$ into a closed region $\bar\Sigma$, which is of class $C'$ in
$R$ with $\xi_x\eta_y - \xi_y\eta_x \ne 0$. Let $(x_0, y_0)$ be a point of $R$, and let $x = x(\sigma)$, $y = y(\sigma)$
($\sigma$ arc length) be a regular curve such that $x(\sigma_0) = x_0$, $y(\sigma_0) = y_0$, $x'(\sigma_0) = \cos\theta$,
$y'(\sigma_0) = \sin\theta$. If $(\xi_0, \eta_0)$ is the point of $\Sigma$ corresponding to $(x_0, y_0)$ and if $ds$ is
the differential of arc length of the curve in $\Sigma$ corresponding to the above, we
define the magnification of $T$ at $(x_0, y_0)$ in the direction $\theta$ by $|ds/d\sigma|$.


Remarks. Clearly this magnification depends only on $(x_0, y_0)$ and $\theta$ and
not on the curve chosen. It is given by

$$\left|\frac{ds}{d\sigma}\right|^2 = E_0\cos^2\theta + 2F_0\sin\theta\cos\theta + G_0\sin^2\theta, \qquad E_0 = E(x_0, y_0), \text{ etc.}$$

The squares of its maximum and minimum (with respect to $\theta$) at $P_0$ are given
by

$$\tfrac12\{E_0 + G_0 + [(E_0 + G_0)^2 - 4(E_0G_0 - F_0^2)]^{1/2}\}, \qquad \tfrac12\{E_0 + G_0 - [(E_0 + G_0)^2 - 4(E_0G_0 - F_0^2)]^{1/2}\}$$

respectively, so that the ratio of the maximum to the minimum magnification
is

$$\mu_0 + (\mu_0^2 - 1)^{1/2}, \qquad \mu_0 = \frac{E_0 + G_0}{2(E_0G_0 - F_0^2)^{1/2}},$$

at $P_0$. If this ratio is uniformly bounded in $R$, it is clear that the inverse
transformation has the same property. In the foregoing remarks, $E$, $F$, and $G$ have
their usual differential-geometric significance:

$$E = \xi_x^2 + \eta_x^2, \qquad F = \xi_x\xi_y + \eta_x\eta_y, \qquad G = \xi_y^2 + \eta_y^2.$$
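The ratio formula stated in these Remarks can be checked by a short computation. The block below is only a sketch of that check and is not part of the original text; the abbreviations $S$ and $\Delta$ are introduced here for brevity and are assumptions of this sketch.

```latex
% Sketch: ratio of maximum to minimum magnification.
% Abbreviations (not in the original): S = E_0 + G_0, \Delta = E_0 G_0 - F_0^2 > 0.
\[
E_0\cos^2\theta + 2F_0\sin\theta\cos\theta + G_0\sin^2\theta
  = \tfrac12\bigl[S + (E_0 - G_0)\cos 2\theta + 2F_0\sin 2\theta\bigr],
\]
so its extreme values over $\theta$ are
\[
\tfrac12\bigl[S \pm \sqrt{(E_0 - G_0)^2 + 4F_0^2}\,\bigr]
  = \tfrac12\bigl[S \pm \sqrt{S^2 - 4\Delta}\,\bigr].
\]
These are the squares of the extreme magnifications, and their product is $\Delta$. Hence
\[
\frac{\text{max magnification}}{\text{min magnification}}
  = \frac{S + \sqrt{S^2 - 4\Delta}}{2\sqrt{\Delta}}
  = \mu_0 + \sqrt{\mu_0^2 - 1},
\qquad \mu_0 = \frac{S}{2\sqrt{\Delta}} .
\]
```

Since the inverse transformation has the reciprocal extreme magnifications at corresponding points, the same ratio is obtained for it, which is the symmetry noted in the Remarks.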


THEOREM 1. Let $R$ and $\Sigma$ be two Jordan regions, $a$, $b$, and $c$ be three distinct
points of $R^*$, and $\alpha$, $\beta$, and $\gamma$ be three distinct points of $\Sigma^*$. Let $\{T\}$ be a family
of 1-1 continuous transformations of the form

$$T\colon \xi = \xi(x, y), \quad \eta = \eta(x, y)$$

which carry $\bar R$ into $\bar\Sigma$, which carry $a$, $b$, and $c$ into $\alpha$, $\beta$, and $\gamma$ respectively, and
which satisfy the following hypotheses:

(1) each $T$ is of class $C'$ within $R$ with $\xi_x\eta_y - \xi_y\eta_x \ne 0$, and

(2) the ratio of the maximum to the minimum magnification of each transformation
at each point $(x, y)$ is $\le K$, which is independent of $x$, $y$, and $T$. Then
there exist functions $M(a)$, $N(a)$, $P(a)$, and $m(a)$ which depend only on $K$, the
regions $R$ and $\Sigma$, and the distribution of the points $a$, $b$, $c$, $\alpha$, $\beta$, and $\gamma$; and there
exists a number $\lambda > 0$ which depends only on $K$ such that

(i) $M(a)$, $N(a)$, $m(a) > 0$ for $a > 0$, $\lim_{a\to 0} M(a) = \lim_{a\to 0} N(a) = \lim_{a\to 0} P(a)
= \lim_{a\to 0} m(a) = 0$,

(ii) all points of $R$ or $\Sigma$ which are at a distance $\ge \rho > 0$ from $R^*$ or $\Sigma^*$ correspond
to points of the other region at a distance $\ge m(\rho)$ from its boundary, and

(iii) the functions $\xi(x, y)$, $\eta(x, y)$, $x(\xi, \eta)$, and $y(\xi, \eta)$ all satisfy conditions of
the form $A[2\lambda; M(a, d)]$ and $B[\lambda; N(a, d)]$ with $M(a, d) = M(a)$, $N(a, d) = N(a)$,
and the equicontinuity condition

$$|\phi(\alpha_2, \beta_2) - \phi(\alpha_1, \beta_1)| \le P(a), \qquad (\alpha_2 - \alpha_1)^2 + (\beta_2 - \beta_1)^2 \le a^2,$$
$$(\alpha, \beta) = (x, y) \text{ or } (\xi, \eta), \qquad \phi = \xi, \eta, x, \text{ or } y.$$




Proof. Since the ratio of maximum to minimum magnification is $\le K$, it
follows that

$$\iint_R (\xi_x^2 + \xi_y^2 + \eta_x^2 + \eta_y^2)\,dx\,dy \le 2K\,m(\Sigma),$$

$$\iint_\Sigma (x_\xi^2 + x_\eta^2 + y_\xi^2 + y_\eta^2)\,d\xi\,d\eta \le 2K\,m(R).$$

From this,† follows the existence of the functions $P(a)$ and $m(a)$ satisfying
the desired conditions.

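Now, let $P_0$ belong to $R$, for example, being at a distance $a$ from $R^*$.
The circle $(x - x_0)^2 + (y - y_0)^2 \le a^2$ is carried into a Jordan subregion of $\Sigma$
which is surely a subset of the circle $(\xi - \xi_0)^2 + (\eta - \eta_0)^2 \le [P(a)]^2$, $(\xi_0, \eta_0)$
being the correspondent of $(x_0, y_0)$. Let

$$A(r) = \iint_{C(P_0, r)} J(x, y)\,dx\,dy = \int_0^r\!\int_0^{2\pi} \rho\,J(x_0 + \rho\cos\theta, y_0 + \rho\sin\theta)\,d\rho\,d\theta,$$
$$0 < r \le a, \qquad J(x, y) = |\xi_x\eta_y - \xi_y\eta_x|.$$

Then

$$A'(r) = r \int_0^{2\pi} J(x_0 + r\cos\theta, y_0 + r\sin\theta)\,d\theta \ge \frac{1}{K} \int_0^{2\pi r} (\xi_s^2 + \eta_s^2)\,ds \ge (2K\pi r)^{-1} [l(C_r)]^2 \ge \frac{2A(r)}{Kr}, \qquad 0 < r < a;$$

here, $s$ denotes arc length on the circle $(x - x_0)^2 + (y - y_0)^2 = r^2$, $C_r$ is the
curve in $\Sigma$ into which this circle is carried, and $l(C_r)$ is its length. Thus

$$A'/A \ge 2/(Kr), \qquad A(a) \le \pi[P(a)]^2,$$

and hence

$$A(r) \le \pi[P(a)]^2 (r/a)^{2/K}, \qquad 0 \le r \le a.$$

Since

$$J(x, y) \ge (1/2K)(\xi_x^2 + \xi_y^2 + \eta_x^2 + \eta_y^2),$$

we find that

$$\iint_{C(P_0, r)} (\xi_x^2 + \xi_y^2 + \eta_x^2 + \eta_y^2)\,dx\,dy \le 2K\pi[P(a)]^2 (r/a)^{2/K}.$$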

† See the author's paper, An analytic characterization of surfaces of finite Lebesgue area, I, American
Journal of Mathematics, vol. 57 (1935), Theorem 1, §2, p. 699.


Hence, we see that (iii) is satisfied (remembering Lemma 1) if we choose
$\lambda = 1/K$, $M(a) = 2K\pi[P(a)]^2$, $N(a) = P(a/2)$.
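The passage from the differential inequality for $A(r)$ to the power bound is a one-line integration. The following is a sketch of that step under the assumption that $A(r) > 0$ and that $A$ is absolutely continuous on $(0, a]$; it is offered as a reading aid, not as the author's own wording.

```latex
% Sketch: integrating A'(r)/A(r) >= 2/(K r) from r to a.
\[
\log\frac{A(a)}{A(r)} = \int_r^a \frac{A'(\rho)}{A(\rho)}\,d\rho
  \;\ge\; \frac{2}{K}\int_r^a \frac{d\rho}{\rho}
  = \frac{2}{K}\log\frac{a}{r},
\]
so that
\[
A(r) \;\le\; A(a)\Bigl(\frac{r}{a}\Bigr)^{2/K}
      \;\le\; \pi\,[P(a)]^2\Bigl(\frac{r}{a}\Bigr)^{2/K},
\qquad 0 < r \le a .
\]
% Combining with  J >= (1/2K)(\xi_x^2+\xi_y^2+\eta_x^2+\eta_y^2):
\[
\iint_{C(P_0,r)} (\xi_x^2+\xi_y^2+\eta_x^2+\eta_y^2)\,dx\,dy
  \;\le\; 2K\,A(r) \;\le\; 2K\pi\,[P(a)]^2\Bigl(\frac{r}{a}\Bigr)^{2/K}.
\]
```

Read against Definition 1, the exponent $2/K$ is $2\lambda$ with $\lambda = 1/K$, which is the condition $A[2\lambda; M(a)]$ with $M(a) = 2K\pi[P(a)]^2$.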


DEFINITION 4. We say that a Jordan region $R$ and three distinct points
$a$, $b$, and $c$ of $R^*$ satisfy a condition $D(L, d_0)$ if (1) the distances $ab$, $ac$, and $bc$
are all $\ge d_0 > 0$, (2) $R^*$ is rectifiable, and (3) if $P_1$ and $P_2$ are any two points
of $R^*$, the ratio $\widehat{P_1P_2}/\overline{P_1P_2} \le L$, where $\widehat{P_1P_2}$ is an arc of $R^*$ joining $P_1$ and $P_2$
which contains at most one of the points $a$, $b$, or $c$ in its interior.


THEOREM 2. Let the regions $R$ and $\Sigma$, the points $a$, $b$, $c$, $\alpha$, $\beta$, and $\gamma$, and the
family $\{T\}$ of transformations satisfy the hypotheses of Theorem 1 and suppose
$(R; a, b, c)$ and $(\Sigma; \alpha, \beta, \gamma)$ satisfy a condition $D(L, d_0)$, $d_0 > 0$. Then the
conclusions of Theorem 1 hold and, in addition, there exists a number $M$ depending
only on $K$, $L$, $d_0$ and the areas $m(\Sigma)$ and $m(R)$, and a number $\mu > 0$ depending
only on $K$ and $L$ such that

f f (2 +82 +2 + Mr, 
C(Po.r)-R 


ff + x7 + y? + S Mr’, 
C(Po.r): 


for any point Po in the plane. 


Proof. We need to prove only the last statement. Let Po be a point in 
the plane and let 0<r<d)/2. Then the set C(Po, r)-R is vacuous or consists 
of a finite or denumerable number of Jordan regions 7, the boundary of each 
of which consists of (1) a finite or denumerable set of arcs of C*(Po, r)- R, 
(2) a finite or denumerable number of arcs of R*, and (3) points of R* which 
are limit points of all of these arcs. All the points of (2) and (3) are on one 
of the arcs abe, bca, or cab of R*, say abc. Clearly 7,* and r,* have at most 
one point in common if »¥n’. Let E,,, be the set (1) above for each r,, let 
let o, be the region of corresponding to rn, let on, let 
C,.n=0,*, let C,-=>—C,,n, let T,,, be the totality of arcs corresponding to E,,n, 
and let T,=>-T,,n. Clearly C,..-C,,.: is at most one point if n¥n’. Let 
U(C,) =SU(C,,n), let U(T',,.) be the sum of the lengths of all the arcs of T,,, 
and let /(T',) =>-i(T,,n). It is clear that 


[1(C,) = 4xm(o). 


Consider an r,, and the closed set R*-r,*. Proceeding along the arc (abc), 
there is a first point P, and a last point Q, of this set. Then, there is an arc 
of E,,, joining P, to Q,. Hence if II, and K, are the corresponding points 




of =*, they are on the arc apy and there is an arc of I’,,, joining them. Now 


U(II,, Kn) (arc of apy) <L times the length of this arc of I,,,. Hence it is 
easy to see that 


1 
UT, = 


1 
UT rn) 2 Z UC,,n) 5 +1 (C,). 


+1 


Any of the above sets may be null and for a set of measure zero of values of r, 
W(C,) and I(T) may be infinite. 
We may now proceed as in Theorem 1. Let 


A(r) = ff J(x, y)dudy = f lof J (xo + pcos 6, yo + psin | dp, 
C(Pyr)-R 0 E, 
OSrSdo/2, y) =|Em — Enel, 


(the integral being zero if the field of integration is null). Then 


1 
A'(r) =r] yo +rsin = + n2)ds 


2 
E, 
> (2Kar)-(L + 1)-*[(C,) ]* = 2K-(L + 1)-*r-A(r). 


Also A <m(Z). Hence as before 
A(r) S 


IV 


ff + + ne + nz )dxdy < 
C(Por) 


Since A (r) <m(2) for all values of r, the theorem follows. 


LEMMA 2. Let $a$, $b_1$, $b_2$, $c$, $d$, and $e$ be measurable functions defined on a
bounded region $R$ with $|a|$, $|b_1|$, $|b_2|$, $|c|$, $|d|$, $|e| \le M$, $M \ge 1$, on $R$. Then there
exist sequences $\{a_n\}$, $\{b_{1,n}\}$, $\{b_{2,n}\}$, $\{c_n\}$, $\{d_n\}$, and $\{e_n\}$ which are analytic
on $\bar R$ and uniformly bounded and which converge almost everywhere on $R$ to
$a$, $b_1$, $b_2$, $c$, $d$, and $e$ respectively. If $b_1 = b_2$, we may choose $b_{1,n} = b_{2,n}$ for each $n$.
If $b_1 = b_2 = b$ and $ac - b^2 \ge m > 0$ on $R$, we may choose the sequences so that there
exist numbers $\bar M$ and $\bar m > 0$ such that

$$|a_n|, |b_n|, |c_n|, |d_n|, |e_n| \le \bar M, \qquad a_nc_n - b_n^2 \ge \bar m.$$

Further, if $ac - b^2 = 1$, the sequences may be chosen so that $a_nc_n - b_n^2 = 1$.


Proof. Let $D$ be a region containing $\bar R$ in its interior and define $a = c = 1$,
$b_1 = b_2 = d = e = 0$ in $D - R$. For $h$ sufficiently small, $a_h$, $b_{1h}$, $b_{2h}$, $c_h$, $d_h$, and $e_h$
are defined and continuous in a region containing $\bar R$ and all are numerically
$\le M$.

Now suppose $b_1 = b_2 = b$, $ac - b^2 \ge m > 0$. We know that $ac - b^2$ is the product
of the maximum by the minimum (for $0 \le \theta \le 2\pi$, $(x, y)$ fixed) of

$$f(x, y; \theta) = a\cos^2\theta + 2b\sin\theta\cos\theta + c\sin^2\theta = \tfrac12[(a + c) + (a - c)\cos 2\theta + 2b\sin 2\theta].$$

Clearly $|f(x, y; \theta)| \le 2M$ so that the minimum above is $\ge m/2M$. Thus

$$a_hc_h - b_h^2 \ge m^2/4M^2 > 0$$

for each $h > 0$ and all $(x, y)$ in $D_h$. The remainder of the proof is obvious.
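The step from the pointwise bound on $f(x, y; \theta)$ to the bound on the averaged coefficients can be spelled out as follows. This is a sketch only: it assumes the quadratic form is positive definite and that the pointwise lower bound $m/2M$ holds at every point of the square over which the average is taken; the symbol $f_h$ is introduced here and is not in the original.

```latex
% Sketch: why averaging preserves the lower bound on  a c - b^2.
% Pointwise:  (max_\theta f)(min_\theta f) = ac - b^2 >= m  and  0 < max_\theta f <= 2M,
% hence  min_\theta f >= m/(2M).
\[
f_h(x,y;\theta) := a_h\cos^2\theta + 2b_h\sin\theta\cos\theta + c_h\sin^2\theta
  = \frac{1}{4h^2}\int_{x-h}^{x+h}\!\!\int_{y-h}^{y+h} f(\xi,\eta;\theta)\,d\xi\,d\eta
  \;\ge\; \frac{m}{2M}
\]
for every $\theta$, because an average of numbers $\ge m/2M$ is $\ge m/2M$. Therefore
\[
a_h c_h - b_h^2
  = \bigl(\max_\theta f_h\bigr)\bigl(\min_\theta f_h\bigr)
  \;\ge\; \bigl(\min_\theta f_h\bigr)^2
  \;\ge\; \frac{m^2}{4M^2} .
\]
```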


LEMMA 3. Let $a$, $b$, and $c$ be analytic in a region $G$ which contains the Jordan
region $\bar R$ in its interior and suppose $|a|$, $|b|$, $|c| \le M$, $ac - b^2 = 1$, $a > 0$; let $\Sigma$
be another Jordan region. Then there exists a unique 1-1 analytic map
$\xi = \xi(x, y)$, $\eta = \eta(x, y)$ of $\bar R$ on $\bar\Sigma$ which carries three given distinct points $p$, $q$,
and $r$ on $R^*$ into three given distinct points $\pi$, $\kappa$, $\rho$ (arranged in the same order)
on $\Sigma^*$, and which satisfies

$$\eta_x = -(b\xi_x + c\xi_y), \qquad \eta_y = a\xi_x + b\xi_y.$$

The Jacobian does not vanish and the ratio of the maximum to the minimum
magnification of the transformation is $\le 2M$ at each point.


Proof. Let $D$ be a Jordan region contained in $G$ and containing $\bar R$ whose
boundary is a regular, analytic, simple, closed curve. It is known† that there
exists a solution $X(x, y)$ of the equation

$$\frac{\partial}{\partial x}(aX_x + bX_y) + \frac{\partial}{\partial y}(bX_x + cX_y) = 0$$

which takes on the values $X = x$ on $D^*$, which is analytic on $\bar D$, and whose
first derivatives do not vanish simultaneously. Clearly there exists an analytic
conjugate function $Y(x, y)$ which satisfies the same equation and the relations

$$Y_x = -(bX_x + cX_y), \qquad Y_y = aX_x + bX_y.$$

The equations $X = X(x, y)$, $Y = Y(x, y)$ yield a 1-1 analytic map of ($\bar D$ and
hence) $\bar R$ onto a region $\bar\Delta$ which carries $p$, $q$, and $r$ into three points $\pi'$, $\kappa'$,
and $\rho'$ arranged on $\Delta^*$ in the same order, and for which $X_xY_y - X_yY_x \ne 0$. If
$\xi = \Xi(X, Y)$, $\eta = H(X, Y)$ is the conformal map of $\bar\Delta$ on $\bar\Sigma$ which carries
$\pi'$, $\kappa'$, and $\rho'$ into $\pi$, $\kappa$, and $\rho$ (respectively), it is easily seen that

$$\xi(x, y) = \Xi[X(x, y), Y(x, y)], \qquad \eta(x, y) = H[X(x, y), Y(x, y)]$$

is a mapping of the desired type. That this mapping is unique follows from
the fact that if $\xi = \xi'(x, y)$, $\eta = \eta'(x, y)$ is another such map, then $(\xi', \eta')$ are
related to $(\xi, \eta)$ by a conformal transformation with $\pi$, $\kappa$, and $\rho$ fixed; thus
$\xi' = \xi$, $\eta' = \eta$.

† See Lichtenstein, loc. cit.

LEMMA 4. Let $\xi = \xi(x, y)$, $\eta = \eta(x, y)$ be a transformation defined on a region
$R$ in which the functions $\xi$ and $\eta$ are of class $D_2$. Suppose also that

$$\eta_x = -\xi_y, \qquad \eta_y = \xi_x$$

almost everywhere in $R$. Then the above map is conformal, i.e., $\xi$ and $\eta$ are
conjugate harmonic functions.

Proof. Let $D$ be a rectangle on the boundary of which $\eta(x, y)$ is absolutely
continuous, such rectangles being almost all rectangles in $R$. Then, from
Lemma 7, §1, it follows that

$$L(S_D) = \iint_D (EG - F^2)^{1/2}\,dx\,dy = \frac12 \iint_D (E + G)\,dx\,dy = \iint_D (\xi_x\eta_y - \xi_y\eta_x)\,dx\,dy = \int_{D^*} \xi\,d\eta,$$

$S_D$ being the surface $\xi = \xi(x, y)$, $\eta = \eta(x, y)$, $(x, y) \in D$, $L(S_D)$ meaning its
Lebesgue area. Since the Geöcze area† $G(S)$ of any surface with this boundary
curve must be at least as great as $L(S_D)$, and since $L(S) \ge G(S)$ for every
surface, we see that $S_D$ is a surface of minimum area bounded by its boundary
curve. Hence $\xi(x, y)$ and $\eta(x, y)$ must be harmonic, since otherwise they
could be replaced by the harmonic functions having the same boundary values
to form a surface of smaller area bounded by the boundary of $S_D$.


THEOREM 3. Let $R$ and $\Sigma$ be Jordan regions, let $p$, $q$, and $r$ be distinct points
on $R^*$ and let $\pi$, $\kappa$, and $\rho$ be distinct points arranged in the same order on $\Sigma^*$.
Let $a$, $b$, and $c$ be bounded, measurable functions defined on $R$:

$$|a|, |b|, |c| \le M, \qquad ac - b^2 = 1.$$

Then there exists a 1-1 continuous transformation $\xi = \xi(x, y)$, $\eta = \eta(x, y)$ of $\bar R$
into $\bar\Sigma$

(i) which carries $p$, $q$, and $r$ into $\pi$, $\kappa$, and $\rho$ respectively,

(ii) which is such that $\xi(x, y)$, $\eta(x, y)$, $x(\xi, \eta)$, and $y(\xi, \eta)$ are of class $D_2$
on $R$ and $\Sigma$, $x = x(\xi, \eta)$, $y = y(\xi, \eta)$ being the inverse transformation,

(iii) in which the conclusions of Theorem 1 apply to the functions $\xi(x, y)$,

† See the author's paper, loc. cit., pp. 696, 698.


$\eta(x, y)$, $x(\xi, \eta)$, and $y(\xi, \eta)$ and in which those of Theorem 2 also apply if
$(R; p, q, r)$ and $(\Sigma; \pi, \kappa, \rho)$ satisfy a condition $D(L, d_0)$, and

(iv) in which the functions $\xi(x, y)$ and $\eta(x, y)$ satisfy

$$\eta_x = -(b\xi_x + c\xi_y), \qquad \eta_y = a\xi_x + b\xi_y$$

almost everywhere on $R$.


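Proof. Let $\{a_n\} \to a$, $\{b_n\} \to b$, $\{c_n\} \to c$ almost everywhere on $R$, the $a_n$, $b_n$,
and $c_n$ being analytic on $\bar R$ and satisfying

$$|a_n|, |b_n|, |c_n| \le M, \qquad a_nc_n - b_n^2 = 1.$$

Let $T_n\colon \xi = \xi_n(x, y)$, $\eta = \eta_n(x, y)$ be the unique analytic transformations of $\bar R$
into $\bar\Sigma$ which carry $p$, $q$, and $r$ into $\pi$, $\kappa$, and $\rho$ respectively and which satisfy

$$\eta_{nx} = -(b_n\xi_{nx} + c_n\xi_{ny}), \qquad \eta_{ny} = a_n\xi_{nx} + b_n\xi_{ny}.$$

Let $x = x_n(\xi, \eta)$, $y = y_n(\xi, \eta)$ denote the inverses. These transformations satisfy
the conditions of Theorem 1 and hence the conclusions of Theorem 1 and
also of Theorem 2 if $(R; p, q, r)$ and $(\Sigma; \pi, \kappa, \rho)$ satisfy a condition $D(L, d_0)$;
it is easily seen that the ratio of maximum to minimum magnification of $T_n$
is $\le \lambda_n + (\lambda_n^2 - 1)^{1/2}$, $\lambda_n = (a_n + c_n)/2$, and is therefore $\le 2M$. Hence a
subsequence $\{n_k\}$ of the integers $\{n\}$ may be chosen so that $\{\xi_{n_k}\}$, $\{\eta_{n_k}\}$, $\{x_{n_k}\}$,
and $\{y_{n_k}\}$ all converge uniformly on $\bar R$ and $\bar\Sigma$ to functions $\xi(x, y)$, $\eta(x, y)$,
$x(\xi, \eta)$, and $y(\xi, \eta)$ respectively and $x = x(\xi, \eta)$, $y = y(\xi, \eta)$ is the inverse of
$T\colon \xi = \xi(x, y)$, $\eta = \eta(x, y)$. Clearly $T$ is a 1-1 continuous transformation of $\bar R$
into $\bar\Sigma$ in which $p$, $q$, and $r$ correspond to $\pi$, $\kappa$, and $\rho$ respectively.

Also, since the ratio of maximum to minimum magnification is $\le 2M$, we
have

$$\iint_R (\xi_{nx}^2 + \xi_{ny}^2 + \eta_{nx}^2 + \eta_{ny}^2)\,dx\,dy \le 4M\,m(\Sigma),$$

$$\iint_\Sigma (x_{n\xi}^2 + x_{n\eta}^2 + y_{n\xi}^2 + y_{n\eta}^2)\,d\xi\,d\eta \le 4M\,m(R),$$

so that it follows from Lemma 6, §1 that $\xi(x, y)$, $\eta(x, y)$, $x(\xi, \eta)$, and $y(\xi, \eta)$
are all of class $D_2$ on $R$ and $\Sigma$. Using the same lemma, we see that (iii) also
holds and that

$$\iint_R [(\eta_x + b\xi_x + c\xi_y)^2 + (\eta_y - a\xi_x - b\xi_y)^2]\,dx\,dy$$
$$\le \liminf_{n\to\infty} \iint_R [(\eta_{nx} + b_n\xi_{nx} + c_n\xi_{ny})^2 + (\eta_{ny} - a_n\xi_{nx} - b_n\xi_{ny})^2]\,dx\,dy = 0,$$

so that (iv) is demonstrated.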




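THEOREM 4. Let $(R; p, q, r)$, $(\Sigma; \pi, \kappa, \rho)$, $a$, $b$, and $c$ satisfy the hypotheses
of Theorem 3 and let $T\colon \xi = \xi(x, y)$, $\eta = \eta(x, y)$ be the transformation derived in
that theorem. Then $T$ enjoys the following further properties:

(i) sets of measure zero and hence measurable sets of $R$ and $\Sigma$ correspond,
and $(\xi_x\eta_y - \xi_y\eta_x)$ and $(x_\xi y_\eta - x_\eta y_\xi)$ are defined and $\ne 0$ except possibly on a set
of measure zero in $R$ and $\Sigma$ respectively;

(ii) if $\phi(x, y)$ and $\psi(\xi, \eta)$ are summable on measurable subsets $D \subset R$ and
$\Delta \subset \Sigma$, the functions $\phi[x(\xi, \eta), y(\xi, \eta)](x_\xi y_\eta - x_\eta y_\xi)$ and
$\psi[\xi(x, y), \eta(x, y)](\xi_x\eta_y - \xi_y\eta_x)$ are summable on $T(D)$ and $T^{-1}(\Delta)$ respectively, and

$$\iint_{T(D)} \phi[x(\xi, \eta), y(\xi, \eta)](x_\xi y_\eta - x_\eta y_\xi)\,d\xi\,d\eta = \iint_D \phi\,dx\,dy,$$

$$\iint_{T^{-1}(\Delta)} \psi[\xi(x, y), \eta(x, y)](\xi_x\eta_y - \xi_y\eta_x)\,dx\,dy = \iint_\Delta \psi\,d\xi\,d\eta;$$

(iii) if $\phi(x, y)$ and $\psi(\xi, \eta)$ are of class $D_2$ on $R$ and $\Sigma$ respectively, then
$\phi[x(\xi, \eta), y(\xi, \eta)]$ and $\psi[\xi(x, y), \eta(x, y)]$ are of class $D_2$ on $\Sigma$ and $R$ respectively and

$$\phi_\xi = \phi_x x_\xi + \phi_y y_\xi, \quad \phi_\eta = \phi_x x_\eta + \phi_y y_\eta, \quad \psi_x = \psi_\xi \xi_x + \psi_\eta \eta_x, \quad \psi_y = \psi_\xi \xi_y + \psi_\eta \eta_y$$

almost everywhere;

(iv) $T$ is uniquely determined by (i), (ii), and (iv) of Theorem 3.

Proof. Let $O$ be an open set in $R$, let $O = \sum_{i=1}^{\infty} R_i$ where the $R_i$ are closed
non-overlapping rectangles on each of which $\eta(x, y)$ is absolutely continuous,
and let $S_i$ be the surface $S_i\colon \xi = \xi(x, y)$, $\eta = \eta(x, y)$, $(x, y) \in R_i$. Clearly $L(S_i)$
is merely the measure of the closed or open region $\bar D_i$ or $D_i$ into which $\bar R_i$ or $R_i$
is carried, $D_i^*$ being a rectifiable curve. From Lemma 7, §1, it follows that

$$m(D_i) = \int_{R_i^*} \xi\,d\eta = \iint_{R_i} (\xi_x\eta_y - \xi_y\eta_x)\,dx\,dy.$$

Thus $\xi_x\eta_y - \xi_y\eta_x \ge 0$, and if $\Omega = T(O)$, then

$$m(\Omega) = \iint_O (\xi_x\eta_y - \xi_y\eta_x)\,dx\,dy.$$

It follows very easily that a measurable set $E$ in $R$ is carried into one in $\Sigma$ and that

$$m[T(E)] = \iint_E (\xi_x\eta_y - \xi_y\eta_x)\,dx\,dy.$$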


The same proof establishes the fact that a measurable set $\Delta$ in $\Sigma$ is carried
into a measurable set $D$ in $R$ and that

$$m(D) = \iint_\Delta (x_\xi y_\eta - x_\eta y_\xi)\,d\xi\,d\eta.$$

Hence the rest of (i) follows easily and we have proved (ii) for the case
$\phi = \psi = 1$.

Suppose ¢, for instance, is bounded. Let {¢,(x, y)} be a sequence of 
step functions which are uniformly bounded and converge to ¢ almost 
everywhere. It is clear that the functions ¢,[x(&, 7), y(é, 7) 
and ¢[x(, 7), 7) are all summable and dominated by a 
summable function, and, for each , it is clear from the above that 


f f = f 
D T(D) 


The result (ii) for ¢ follows by a passage to the limit, since ¢, [x(€, 7), v(£, 7) ] 
converges almost everywhere on = to ¢[x(£, 7), y(&, 7) ] and the latter func- 
tion is certainly measurable. If ¢ is merely summable, let ¢:.=¢ where ¢20, 
¢:=0 elsewhere, where and ¢2=0 elsey tere, let if 
din=N elsewhere, =¢2 if d2<N, =0 elsewhere. Then it fol- 
lows that the functions 


Gi,n(XEV_ — XqVe), $2,n(XtV_ — 


form monotone non-decreasing sequences of non-negative summable func- 
tions converging almost everywhere to and Ve) 
spectively, their integrals remaining bounded. Hence (ii) follows. 

To prove (iii), let (x, y) be of class Dz on R, let G be a subregion of R 
for which Ge R, and let I’ be the corresponding subregion of 2(I' c = clearly). 
Now, let A be a rectangle of I along which x(é, n) and y(é, 7) are absolutely 
continuous. Then, using Lemma 7, §1, we see that 


J ole n), y(é, 0) = y)dn = fon — oynz)dxdy, 


Clearly these relations follow for any rectangle A of G so that ¢ is A.C.T. 
as a function of — and 7 by Lemma 3, §1. Using the same lemma, it follows 
that 


o = o Eynz)dxdy = on Pynz)adxdy, 




0g 0d 
ff — didn = ff —-(Esny £ynz)dxdy ff (— + o,f z)dxdy, 
2 On o On oO 


for every open set 2 in > with Qe L. Hence, almost everywhere in R and >, 


Since £,n,—£&:#~0 except on a set of measure zero, it follows that 
oz = t dy = + dary, 


almost everywhere on R and ~. Setting ¢=- and y in turn, we obtain the 
relations 


+ = + = 0, + = 0, + = 1 


almost everywhere, from which it follows that 


= + dye, by = + 


almost everywhere on R and 2; the relations for y are proved similarly. It 
follows that 


(E2my — — = 1 


almost everywhere on R and >. 
To show that ¢ is of class D2 in (¢, 7) we see that 


+ 7) -(Eny £ynz) = ag? + + co; = 2M (¢2 + 


using the relations above and the relations —(bt.+cé,), n,=aé.+0é,, 
ac—b?=1. The fact for y as a function of (x, y) can be obtained from the 
above and the fact that 


1 
ay? + + ou + ¥/) 


To show (iv), let $\xi = \xi'(x, y)$, $\eta = \eta'(x, y)$ be another map of $\bar R$ into $\bar\Sigma$
satisfying (i), (ii), and (iv) of Theorem 3. Then $\xi'$ and $\eta'$ are of class $D_2$ in $\xi$
and $\eta$ and all the rules for differentiation apply. We see that $\eta'_\eta = \xi'_\xi$ and
$\eta'_\xi = -\xi'_\eta$ almost everywhere on $\Sigma$ so the map $\xi' = \xi'(\xi, \eta)$, $\eta' = \eta'(\xi, \eta)$ is
conformal by Lemma 4. Hence $\xi' = \xi$, $\eta' = \eta$.

3. A special elliptic system of partial differential equations of the first 
order. We prove first 


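THEOREM 1. Let $D(x, y)$ and $E(x, y)$ be of class $L_2$ on $\Sigma_a = C(0, 0; a)$ such
that

$$\iint_{C(P, \rho)\cdot\Sigma_a} D^2\,dx\,dy \le M\rho^{\lambda}, \qquad \iint_{C(P, \rho)\cdot\Sigma_a} E^2\,dx\,dy \le M\rho^{\lambda}, \qquad \lambda > 0.$$

Then the function

$$U(x, y) = \frac{1}{2\pi} \iint_{\Sigma_a} \frac{(\xi - x)D(\xi, \eta) + (\eta - y)E(\xi, \eta)}{(\xi - x)^2 + (\eta - y)^2}\,d\xi\,d\eta$$

is defined and continuous over the whole space, and satisfies a uniform Hölder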
condition of the form 
| U(«1, 91) — y2)| NP?2, N = 12(2eM)*2[(2 — + + 


dédn 


and 
| U(x, y) | S 21 + (a, y)eBa. 


Proof. Define D(x, y) =0 for x?+y? 2a? and let 


LJ. 


(€ — x)-D 
= dédn. 


h(x, y;7r) = nn | D(x + r cos 8, y + rsin 6)-cos 6| dé. 
0 


Let 


Then 


2r 2 
h*(x, f | D cos ao. 
0 


By the Hélder inequality h?(x, y; r) is summable on any interval 0<r<h, 
since 


h 2r h 2r 
f f D*dédn = 24 f f rD*drd@ = 24 f al D? cos? dr 
C(z,ysh) 0 0 0 0 


h 
=f h?(x, y; r)dr. 
0 
Thus 


h 
f h?(x, y; r)dr S 
0 


If we let H(x, y; r) =S-h(x, ; p)dp, we find by the Hélder inequality that 
A(x, r) (29M) | 


If N is so large that =, is in the circle C(x, y; NV), then H(x, y;r) <(2xMNa)¥?2 
for r= N. Hence 


J J, h(x, nar 


1 
y;p) — (x, y; 6) + —f y; r)dr 


S (1 + 


for every «>0, and every p>0. Hence U(x, y) is defined. 

Now, let P; and P2 be any two points and let P be the midpoint of the 
segment P,P2. Let p= P,P, /2 and let a circle o(kp) of radius kp, k>1, be de- 
scribed about P as center. Let a be the inclination of the segment P;P2. Then 


(é x2)D(é, n) 
— x2)? + (n — 


ff (€ — n) 
o(kp) — + — 91)? 


p 26 — a)D 
+f Sf cos (20 — a)D(é, 7) atin | as, 
—p W—o(kp) 


where 


r?=(—-Z—s-cos a)?+(n—F—s-sin a)?, 0=tan-! 


Thus 


(k+1)p 


(k+1)p 
| — Ux(P2) | <f x1, r)dr +f yo; r)dr 
0 


0 
+ f | f y(s); rar] ds 
(k—1)p 


3 
r!27[x(s), y(s); r]drds 
—p 2 


< + + [2 — 
(k = 2, x(s) = Z+s cosa, y(s) = sina). 


| 
&-—Z—s-cosa 
| 




The result follows by using the similar result for U —U,. 
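LEMMA 1. Let $D(x, y)$ and $E(x, y)$ be of class $L_2$ on $\Sigma_a$ with

$$\iint_{C(P, \rho)\cdot\Sigma_a} D^2\,dx\,dy \le M\rho^{\lambda}, \qquad \iint_{C(P, \rho)\cdot\Sigma_a} E^2\,dx\,dy \le M\rho^{\lambda}, \qquad \lambda > 0, \quad \rho > 0.$$

Then there exist sequences $\{D_n(x, y)\}$ and $\{E_n(x, y)\}$ of functions analytic
on $\bar\Sigma_a$ satisfying the above condition with the same $M$ and $\lambda$, and such that

$$\lim_{n\to\infty} \iint_{\Sigma_a} [(D_n - D)^2 + (E_n - E)^2]\,dx\,dy = 0.$$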


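Proof. Define $D(x, y) = E(x, y) = 0$ for $x^2 + y^2 \ge a^2$, and let $D_h$ and $E_h$ be
the usual average functions. Then $D_h$ and $E_h$ are continuous on $\bar\Sigma_a$ and

$$\lim_{h\to 0} \iint_{\Sigma_a} [(D_h - D)^2 + (E_h - E)^2]\,dx\,dy = 0,$$

by Lemma 5, §1. Moreover for each $h > 0$

$$\iint_{C(P, \rho)} D_h^2\,dx\,dy = \iint_{C(P, \rho)} \left[\frac{1}{4h^2} \int_{x-h}^{x+h}\!\int_{y-h}^{y+h} D(\xi, \eta)\,d\xi\,d\eta\right]^2 dx\,dy$$
$$\le \frac{1}{4h^2} \iint_{C(P, \rho)} \left[\int_{x-h}^{x+h}\!\int_{y-h}^{y+h} D^2(\xi, \eta)\,d\xi\,d\eta\right] dx\,dy$$
$$= \frac{1}{4h^2} \int_{-h}^{h}\!\int_{-h}^{h} \left[\iint_{C(P, \rho)} D^2(x + \xi, y + \eta)\,dx\,dy\right] d\xi\,d\eta \le M\rho^{\lambda},$$

the same being true for $E$. The lemma follows easily from this.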


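LEMMA 2. Suppose that $D_n$, $E_n$, $D$, and $E$ are all of class $L_2$ on $\Sigma_a$, satisfy
the condition of Lemma 1, $M$ and $\lambda$ being independent of $n$, and the condition

$$\lim_{n\to\infty} \iint_{\Sigma_a} [(D_n - D)^2 + (E_n - E)^2]\,dx\,dy = 0.$$

Then the functions

$$U_n(x, y) = \frac{1}{2\pi} \iint_{\Sigma_a} \frac{(\xi - x)D_n + (\eta - y)E_n}{(\xi - x)^2 + (\eta - y)^2}\,d\xi\,d\eta$$

converge uniformly on any bounded plane region to the function

$$U(x, y) = \frac{1}{2\pi} \iint_{\Sigma_a} \frac{(\xi - x)D + (\eta - y)E}{(\xi - x)^2 + (\eta - y)^2}\,d\xi\,d\eta.$$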




Proof. Since the U,(x, y) are equicontinuous on any bounded set (in fact 
the whole plane) it is sufficient to prove the convergence at each point 
Po: (xo, Vo). 

Choose ¢>0, and choose po so small that 


1 (é x0)Dn + (n yo) En | 
déd: 
— xo)? + (n — yo)? 


1 _ D _ E 
—ff (E — + (n — Yo) 
— Xo)? + (n — Yo)? 2 


+ 


Then there exists an Ny such that for »>No, 


| 


(€ — x0)(Dn — D) + (n — yo)(En — E) 
2a J J — x0)? + (n — yo)? 


1 


f f [(D, — D)? + — 


1/2 € 
S LSS [(D, — D)? + (E. — | < 


This proves the lemma. 


DEFINITION 1. Let $D$ and $E$ be functions of class $L_2$ on $\Sigma_a$. Then if $u$
and $v$ are functions of class $D_2$ on $\Sigma_a$, we define

$$K(u, v) = \iint_{\Sigma_a} [(u_x + D)v_x + (u_y + E)v_y]\,dx\,dy,$$

$$J(u) = \iint_{\Sigma_a} [(u_x + D)^2 + (u_y + E)^2]\,dx\,dy.$$

If the functions $D$ and $E$ have subscripts, we shall denote the integrals formed
by using the new functions by putting the same subscripts on $J$ and $K$.


Remarks. It is clear that $J(u + \zeta) = J(u) + 2K(u, \zeta) + D(\zeta)$. Also

$$J(u) \le 2D(u) + 2\iint_{\Sigma_a} (D^2 + E^2)\,dx\,dy,$$

$$D(u) \le 2J(u) + 2\iint_{\Sigma_a} (D^2 + E^2)\,dx\,dy.$$

Accordingly, if $H$ is the harmonic function which takes on the same boundary
values as $u$, we see that

$$J(u) \le J(H) \le 2D(H) + 2\iint_{\Sigma_a} (D^2 + E^2)\,dx\,dy;$$

$$D(u) \le 4D(H) + 6\iint_{\Sigma_a} (D^2 + E^2)\,dx\,dy.$$

THEOREM 2. Let $D$ and $E$ satisfy the conditions of Lemma 1, and let $u^*$
be a continuous function defined on $\Sigma_a^*$ for which $D(H^*)$ is finite, $H^*$ being the
harmonic function which takes on the given boundary values. Then there exists a
unique function $u$ of class $D_2$ on $\bar\Sigma_a$ which takes on these boundary values and
minimizes $J(u)$ among all such functions. The function $u(x, y)$ is given by

$$u(x, y) = H_a(x, y) + U_a(x, y),$$
$$U_a(x, y) = \frac{1}{2\pi} \iint_{\Sigma_a} \frac{(\xi - x)D + (\eta - y)E}{(\xi - x)^2 + (\eta - y)^2}\,d\xi\,d\eta,$$

$H_a(x, y)$ being the harmonic function which takes on the boundary values $u^* - U_a$.


Proof. Let $\{D_n\}$ and $\{E_n\}$ be sequences of functions, analytic on $\bar\Sigma_a$ and
satisfying the conditions of Lemma 2 with respect to $D$ and $E$. Let $U(x, y)$
be any function of class $D_2$ on $\bar\Sigma_a$ taking on the given boundary values and
let $u_n(x, y)$ be the unique solution, for each $n$, of

$$\Delta u_n + D_{nx} + E_{ny} = 0 \qquad (\Delta u = u_{xx} + u_{yy})$$

which takes on the given boundary values; each $u_n$ is the minimizing function
for $J_n(u)$. Then $J_n(U) \to J(U)$, $J_n(U) \ge J_n(u_n)$. Thus $J(U) \ge \liminf_{n\to\infty} J_n(u_n)$.
On the other hand, the functions $u_n(x, y)$ are given by

$$u_n(x, y) = H_{a,n}(x, y) + U_{a,n}(x, y),$$
$$U_{a,n}(x, y) = \frac{1}{2\pi} \iint_{\Sigma_a} \frac{(\xi - x)D_n + (\eta - y)E_n}{(\xi - x)^2 + (\eta - y)^2}\,d\xi\,d\eta,$$

where $U_{a,n}(x, y)$ tends uniformly to $U_a(x, y)$ so that $H_{a,n}(x, y)$ tends uniformly
to $H_a(x, y)$ and hence $u_n(x, y)$ tends uniformly to the above $u(x, y)$. Since
$D(u_n)$ is uniformly bounded, $u(x, y)$ is of class $D_2$ and $J(u) \le \liminf_{n\to\infty} J_n(u_n)$
by Lemma 6, §1. Hence $u(x, y)$ minimizes $J(u)$.
Let $\zeta$ be of class $D_2$ on $\bar\Sigma_a$ and zero on $\Sigma_a^*$. Then

$$J(u + \lambda\zeta) = J(u) + 2\lambda K(u, \zeta) + \lambda^2 D(\zeta).$$

Since $u(x, y)$ gives $J$ a minimum, the middle term must vanish. Thus

$$J(u + \zeta) = J(u) + D(\zeta) > J(u)$$

unless $\zeta = 0$.




Remarks. If we let $u_{a,0}(x, y)$ be the minimizing function which is zero
on $\Sigma_a^*$, we see that it is of class $D_2$. If $D_n$ and $E_n$ are as in Lemma 2, it is
clear that $u_{a,0,n}$ converges uniformly to $u_{a,0}$ and $D(u_{a,0,n})$ is uniformly
bounded.


THEOREM 3. Let $D$ and $E$ satisfy the conditions of Lemma 1 and let $u$ be
the minimizing function for $J(u)$ which takes on given continuous boundary values
for which $J(u)$ is finite. Then there exists a unique function $v(x, y)$ which
is of class $D_2$ on $\Sigma_a$, vanishes at the origin, and satisfies

$$v_x = -(u_y + E), \qquad v_y = u_x + D$$

almost everywhere. Any other function $V(x, y)$ which is of class $D_2$ and satisfies
these relations, differs from $v$ by a constant. The function $v(x, y)$ is given by

$$v(x, y) = K_a(x, y) + V_a(x, y),$$
$$V_a(x, y) = \frac{1}{2\pi} \iint_{\Sigma_a} \frac{(\xi - x)E - (\eta - y)D}{(\xi - x)^2 + (\eta - y)^2}\,d\xi\,d\eta,$$

where $K_a(x, y)$ is the conjugate of the $H_a(x, y)$ of Theorem 2.
Proof. Choose $\{D_n\}$ and $\{E_n\}$ as in Lemma 2, let $u_n(x, y)$ be the minimizing function for $J_n(U)$ with the given boundary values, and let $v_n(x, y)$ satisfy

$$v_{nx} = -(u_{ny} + E_n), \qquad v_{ny} = u_{nx} + D_n.$$

Then
$$v_n(x, y) = K_{a,n}(x, y) + V_{a,n}(x, y),$$
$$V_{a,n}(x, y) = \frac{1}{2\pi} \iint_{\Sigma_a} \frac{(\xi - x) E_n - (\eta - y) D_n}{(\xi - x)^2 + (\eta - y)^2}\, d\xi\, d\eta.$$
It is seen by well known methods of differentiating the functions $U_{a,n}$ and $V_{a,n}$ that

$$V_{a,n,x} = -(U_{a,n,y} + E_n), \qquad V_{a,n,y} = U_{a,n,x} + D_n$$

on $\Sigma_a$. Thus it is clear that $K_{a,n}(x, y)$ is the conjugate of $H_{a,n}(x, y)$ for each $n$. Since we saw that $H_{a,n}(x, y)$ converged uniformly to $H_a(x, y)$ in Theorem 2, and since $V_{a,n}(x, y)$ obviously tends uniformly to $V_a(x, y)$, it is clear that $K_{a,n}(x, y)$ tends uniformly to $K_a(x, y)$ and $v_n(x, y)$ tends uniformly to the above $v(x, y)$ on each closed subregion of $\Sigma_a$. Also $D(v_n) = J_n(u_n)$ and hence $D(v_n)$ is uniformly bounded.

Now, let $R$ be a closed subregion of $\Sigma_a$. Then $v(x, y)$ is of class $D_2$ on $R$ with $D(v) \le \liminf_{n\to\infty} D(v_n)$, which are uniformly bounded. Hence, by Lemma 6, §1,




152 C. B. MORREY [January 


$$\iint_R \bigl[(v_x + u_y + E)^2 + (v_y - u_x - D)^2\bigr]\,dx\,dy \le \liminf_{n\to\infty} \iint_R \bigl[(v_{nx} + u_{ny} + E_n)^2 + (v_{ny} - u_{nx} - D_n)^2\bigr]\,dx\,dy = 0.$$

Hence $v(x, y)$ satisfies the desired relations almost everywhere. Now if $V$ is of class $D_2$ and satisfies the same relations almost everywhere, $V_x - v_x = V_y - v_y = 0$ almost everywhere so that $V - v$ is a constant.


THEOREM 4. If $D$ and $E$ satisfy the conditions of Lemma 1 and $u(x, y)$ and $v(x, y)$ are of class $D_2$ on $\Sigma_a$ and satisfy
$$v_x = -(u_y + E), \qquad v_y = u_x + D$$
almost everywhere, then
$$u(x, y) = H_a(x, y) + U_a(x, y), \qquad v(x, y) = K_a(x, y) + V_a(x, y),$$
where $U_a$ and $V_a$ have their previous significance and $H_a$ and $K_a$ are conjugate harmonic functions.


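Proof. Let $b < a$ so that $v$ is absolutely continuous on $\Sigma_b^*$, this being true for all values of $b < a$ excepting those in a certain set of measure zero (using Lemma 2, §1). Then if $\zeta$ is of class $D_2$ on $\Sigma_b$, we have

$$\iint_{\Sigma_b} (\zeta_x v_y - \zeta_y v_x)\,dx\,dy = \int_{\Sigma_b^*} \zeta\,dv.$$

Hence if $\zeta$ is also zero on $\Sigma_b^*$, we see that
$$\iint_{\Sigma_b} \bigl[(u_x + D)\zeta_x + (u_y + E)\zeta_y\bigr]\,dx\,dy = \iint_{\Sigma_b} (\zeta_x v_y - \zeta_y v_x)\,dx\,dy = 0.$$

Thus $u$ is the minimizing function for $J(u)$ on $\Sigma_b$ with $J(u)$ finite, and $v(x, y)$ is its "conjugate" as in Theorem 3. Hence

$$u(x, y) = H_b(x, y) + \frac{1}{2\pi} \iint_{\Sigma_b} \frac{(\xi - x) D + (\eta - y) E}{(\xi - x)^2 + (\eta - y)^2}\, d\xi\, d\eta = H_a(x, y) + U_a(x, y);$$

$$v(x, y) = K_b(x, y) + \frac{1}{2\pi} \iint_{\Sigma_b} \frac{(\xi - x) E - (\eta - y) D}{(\xi - x)^2 + (\eta - y)^2}\, d\xi\, d\eta = K_a(x, y) + V_a(x, y).$$

Clearly $H_a$ and $K_a$ are independent of $b$, and, since $H_b$ and $K_b$ are conjugate harmonic functions, it is easily seen that $H_a$ and $K_a$ are.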




4. A more general linear elliptic pair of partial differential equations. In this section, we consider the pair of equations

$$v_x = -(b_2 u_x + c u_y + e), \qquad v_y = a u_x + b_1 u_y + d, \tag{1}$$

where we assume that $a$, $b_1$, $b_2$, $c$, $d$, and $e$ are measurable on a bounded Jordan region $R$ with

$$|a|, |b_1|, |b_2|, |c|, |d|, |e| \le M, \qquad 4ac - (b_1 + b_2)^2 \ge m > 0. \tag{2}$$


THEOREM 1. Let $R$ be a bounded Jordan region and $H^*$ a function continuous on $\bar R$ and harmonic on $R$ for which $D(H^*)$ is finite, and suppose $d = e = 0$ on $R$. Then there exists a unique pair of functions $u(x, y)$ and $v(x, y)$

(i) which are of class $D_2$ on $\bar R$ and $R$ respectively,

(ii) which satisfy (1) almost everywhere on $R$, and

(iii) for which $u(x, y) = H^*(x, y)$ on $R^*$ and $v(x_0, y_0) = 0$. These functions also satisfy

(iv) conditions $A[2\lambda, M'(a, \delta)]$ and $B[\lambda, N'(a, \delta)]$ on $R$ where $\lambda$ depends only on $M$ and $m$, and $M'(a, \delta)$, $N'(a, \delta)$ depend only on $M$, $m$, $R$, and the maximum of $|H^*|$ on $R$, and the maximum of $|u|$ depends only on the bound for $|H^*|$ on $R^*$.

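Proof. Approximate to $a$, $b_1$, $b_2$, $c$, $d$, and $e$ by sequences $\{a_n\}$, $\{b_{1,n}\}$, $\{b_{2,n}\}$, $\{c_n\}$, of functions analytic on $\bar R$ for which

$$|a_n|, |b_{1,n}|, |b_{2,n}|, |c_n| \le \bar M, \qquad 4a_n c_n - (b_{1,n} + b_{2,n})^2 \ge \bar m > 0,$$

and approximate to $H^*$ by harmonic functions $H_n^*$ for which $D(H_n^*) \le G$ and such that the solution $u_n$ of

$$\frac{\partial}{\partial x}(a_n u_{nx} + b_{1,n} u_{ny}) + \frac{\partial}{\partial y}(b_{2,n} u_{nx} + c_n u_{ny}) = 0 \tag{3}$$

which coincides with $H_n^*$ on $R^*$ is analytic on $\bar R$† for each $n$; $\bar M$, $\bar m$, and $G$ are independent of $n$. For each $n$, there exists a unique function $v_n(x, y)$, with $v_n(x_0, y_0) = 0$, which satisfies

$$v_{nx} = -(b_{2,n} u_{nx} + c_n u_{ny}), \qquad v_{ny} = a_n u_{nx} + b_{1,n} u_{ny}. \tag{4}$$

Define $A_n(x, y)$, $B_n(x, y)$, $C_n(x, y)$ by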


† This can be done by taking a sequence of regions $R_n$, each bounded by analytic curves, which closes down on $R$ and then assigning proper analytic boundary values on $R_n^*$. That the solutions of (3) exist under these conditions follows from the results stated in the article by L. Lichtenstein on the theory of elliptic partial differential equations in the Encyklopädie der Mathematischen Wissenschaften, vol. II 3$^2$, pp. 1280-1334.




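$$A_n = \frac{a_n^2 u_{nx}^2 + 2a_n b_{1,n} u_{nx} u_{ny} + (1 + b_{1,n}^2) u_{ny}^2}{a_n u_{nx}^2 + (b_{1,n} + b_{2,n}) u_{nx} u_{ny} + c_n u_{ny}^2},$$

$$B_n = \frac{a_n b_{2,n} u_{nx}^2 + (a_n c_n + b_{1,n} b_{2,n} - 1) u_{nx} u_{ny} + b_{1,n} c_n u_{ny}^2}{a_n u_{nx}^2 + (b_{1,n} + b_{2,n}) u_{nx} u_{ny} + c_n u_{ny}^2}, \tag{5}$$

$$C_n = \frac{(1 + b_{2,n}^2) u_{nx}^2 + 2 b_{2,n} c_n u_{nx} u_{ny} + c_n^2 u_{ny}^2}{a_n u_{nx}^2 + (b_{1,n} + b_{2,n}) u_{nx} u_{ny} + c_n u_{ny}^2},$$

where these expressions are defined; otherwise, let $A_n = C_n = 1$, $B_n = 0$. It is easily verified that $A_n$, $B_n$, $C_n$ are measurable for each $n$ and that

$$A_n C_n - B_n^2 = 1, \qquad |A_n|, |B_n|, |C_n| \le K(\bar M, \bar m),$$
$$l\,(U_x^2 + U_y^2) \le A_n U_x^2 + 2B_n U_x U_y + C_n U_y^2 \le L\,(U_x^2 + U_y^2), \tag{6}$$

where $K$, $l$, and $L$ depend only on $\bar M$ and $\bar m$ ($U$ is any function of class $D_2$ on $R$).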
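
The first relation in (6) can be checked numerically at a single point. The sketch below assumes that (5) is equivalent to the compact form $A = (P^2 + u_y^2)/S$, $B = (PQ - u_x u_y)/S$, $C = (Q^2 + u_x^2)/S$ with $P = a u_x + b_1 u_y$, $Q = b_2 u_x + c u_y$, $S = u_x P + u_y Q$. This compact form is a reading of the damaged display, and the function name `abc_coefficients` is hypothetical.

```python
import math

def abc_coefficients(a, b1, b2, c, ux, uy):
    """Pointwise A, B, C under the assumed compact form of (5).

    P and Q are the right-hand sides of (4): v_y = P and v_x = -Q.
    """
    P = a * ux + b1 * uy
    Q = b2 * ux + c * uy
    S = ux * P + uy * Q          # the common denominator in (5)
    if S == 0:                   # expressions undefined: fall back as in the text
        return 1.0, 0.0, 1.0
    A = (P * P + uy * uy) / S
    B = (P * Q - ux * uy) / S
    C = (Q * Q + ux * ux) / S
    return A, B, C

# Sample values with 4ac - (b1 + b2)^2 > 0, so that S > 0 when the gradient is nonzero.
a, b1, b2, c, ux, uy = 2.0, 0.3, -0.1, 1.5, 0.7, -1.2
A, B, C = abc_coefficients(a, b1, b2, c, ux, uy)
det = A * C - B * B                       # should be 1 up to rounding
row1 = A * ux + B * uy                    # should reproduce a*ux + b1*uy
row2 = B * ux + C * uy                    # should reproduce b2*ux + c*uy
```

With this choice the symmetric matrix with entries $A$, $B$, $C$ sends the gradient $(u_x, u_y)$ to $(P, Q)$, which is why (4) takes the form of a Beltrami system with unit determinant.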

Let $p$, $q$, and $r$ be three distinct points on $R^*$ and $\pi$, $\kappa$, and $\rho$ be three distinct points arranged in the same order on $\Sigma^*$, the boundary of the unit circle. Map $\bar R$ on $\bar\Sigma$ by functions $\xi = \xi_n(x, y)$, $\eta = \eta_n(x, y)$ where $p$, $q$, and $r$ correspond respectively to $\pi$, $\kappa$, and $\rho$ and where

$$\eta_{nx} = -(B_n \xi_{nx} + C_n \xi_{ny}), \qquad \eta_{ny} = A_n \xi_{nx} + B_n \xi_{ny}. \tag{7}$$

Using the relations (4), (5), and (7) and Theorem 4, §2 we find that $u_n(\xi, \eta)$ and $v_n(\xi, \eta)$ are of class $D_2$ on $\Sigma$ and satisfy $v_{n\xi} = -u_{n\eta}$, $v_{n\eta} = u_{n\xi}$ almost everywhere. They are therefore conjugate harmonic functions (in case $u_{nx} = u_{ny} = 0$, it is clear that $v_{nx} = v_{ny} = 0$ at almost all of these points; hence at almost all corresponding points, $u_{n\xi} = u_{n\eta} = v_{n\xi} = v_{n\eta} = 0$; this takes care of points for which $A_n = C_n = 1$, $B_n = 0$). Since the transformations (7) are equicontinuous both ways, a subsequence $\{n_k\}$ of the integers may be chosen so that the corresponding transformations and their inverses converge uniformly to a certain 1-1 continuous transformation of $\bar R$ into $\bar\Sigma$ and its inverse. Thus the sequences $\{u_{n_k}(\xi, \eta)\}$ and hence $\{u_{n_k}(x, y)\}$ converge uniformly to certain functions $u(\xi, \eta)$ and $u(x, y)$ respectively, $u(x, y)$ coinciding with $H^*$ on $R^*$ and $u(\xi, \eta)$ being harmonic; and the functions $v_{n_k}(\xi, \eta)$ converge uniformly on each closed subregion of $\Sigma$ to $v(\xi, \eta)$, the conjugate of $u(\xi, \eta)$, and so

† It may be shown that $u = u_n(x, y)$, $v = v_n(x, y)$ carries $R$ into a region on a finite sheeted Riemann surface. The transformation (7) merely amounts to mapping this Riemann region conformally on the unit circle $\Sigma$ in a $(\xi, \eta)$ plane. We shall use this transformation in the proof of Theorem 2 where this interpretation is not valid.




$v_{n_k}(x, y)$ converge uniformly on each closed subregion of $R$ to a certain function $v(x, y)$.
It is easily verified that (see the proof of Theorem 4, §2)

$$\iint_D (A_n U_x^2 + 2B_n U_x U_y + C_n U_y^2)\,dx\,dy = \iint_{\Delta_n} (U_\xi^2 + U_\eta^2)\,d\xi\,d\eta \tag{8}$$

for any function $U$ of class $D_2$ (in either $(x, y)$ or $(\xi, \eta)$), $D$ and $\Delta_n$ being corresponding regions of $R$ and $\Sigma$. Since each $u_n$ is harmonic, it follows that

$$\iint_\Sigma (u_{n\xi}^2 + u_{n\eta}^2)\,d\xi\,d\eta \le \iint_\Sigma (H_{n\xi}^{*2} + H_{n\eta}^{*2})\,d\xi\,d\eta \le L \cdot G,$$

where $L$ is the constant in (6) (using relations (6) and (8)). Hence, by (6) and (8), $D(u_n) \le L \cdot G / l$ (independent of $n$). Thus $u(x, y)$ and $v(x, y)$ are of class $D_2$ on $\bar R$ and $R$ respectively. Furthermore, by Lemma 6, §1, it follows easily that

$$\iint_R \bigl[(v_x + b_2 u_x + c u_y)^2 + (v_y - a u_x - b_1 u_y)^2\bigr]\,dx\,dy = 0.$$

Thus the existence and properties (i), (ii), and (iii) are established.

To show (iv), define $A$, $B$, and $C$ by (5) in terms of $a$, $b_1$, $b_2$, $c$, and $u$, and perform the transformation (7). The relations (6) hold with $M$ and $m$ replacing $\bar M$ and $\bar m$, and $u(x, y)$, $v(x, y)$ are carried into conjugate harmonic functions. By Theorem 4, §2, the functions $\xi(x, y)$ and $\eta(x, y)$ satisfy conditions $A[2\lambda, M(a)]$ and $B[\lambda, N(a)]$ where $\lambda$ depends only on $M$ and $m$ and $M(a)$ and $N(a)$ depend only on $R$, $(p, q, r)$, $(\pi, \kappa, \rho)$, $M$, and $m$. Thus it is easy to see how to complete the proof of (iv).

Now, suppose $U$, $V$ is another pair of functions obeying the conclusions (i), (ii), and (iii). Then $\bar U$ and $\bar V$, where $\bar U = U - u$, $\bar V = V - v$, satisfy these conditions with $H^* = 0$. By defining $A$, $B$, and $C$ by (5) in terms of $\bar U$, $a$, $b_1$, $b_2$, and $c$, and performing the transformation (7), we see that $\bar U(\xi, \eta)$ and $\bar V(\xi, \eta)$ are conjugate harmonic functions for which $\bar U(\xi, \eta) = 0$ on $\Sigma^*$ and $\bar V(\xi_0, \eta_0) = 0$. Thus $\bar U = \bar V = 0$.


THEOREM 2. Let $R$ be a Jordan region and $p$, $q$, and $r$ be distinct points on $R^*$; we assume that $(R; p, q, r)$ satisfies a $D(\Lambda, d_0)$ condition. Let $H^*$ be continuous on $\bar R$ and harmonic on $R$ with $D(H^*)$ finite. Suppose $a$, $b_1$, $b_2$, $c$, $d$, and $e$ satisfy (2) where $d$ and $e$ are not necessarily zero. Then the conclusions of Theorem 1 hold except that (iv) is replaced by

(iv)$_a$ $u$ and $v$ satisfy a condition $B[\lambda, N'(a, \delta)]$ on $R$ and $|u| \le P$ on $\bar R$,

(iv)$_b$ $\displaystyle\iint_D (u_x^2 + u_y^2)\,dx\,dy \le N$

for each closed subregion $D$ of $R$; here $\lambda$ depends only on $M$, $m$, $\Lambda$, and $d_0$, $N'(a, \delta)$ and $P$ depend only on these, on $(R; p, q, r; \pi, \kappa, \rho)$, and on the maximum of $|H^*|$ on $R^*$, and $N$ depends only on these and on the distance of $D$ from $R^*$.


Proof. Approximate as in Theorem 1 to $a$, $b_1$, $b_2$, $c$, $d$, and $e$ by sequences of analytic functions $\{a_n\}$, $\{b_{1,n}\}$, $\{b_{2,n}\}$, $\{c_n\}$, $\{d_n\}$, and $\{e_n\}$ and to $H^*$ by harmonic functions $H_n^*$ in such a way that if $u_n$ is the solution of

$$\frac{\partial}{\partial x}(a_n u_{nx} + b_{1,n} u_{ny} + d_n) + \frac{\partial}{\partial y}(b_{2,n} u_{nx} + c_n u_{ny} + e_n) = 0$$

which coincides on $R^*$ with $H_n^*$, then $u_n$ is analytic on $\bar R$ and $D(H_n^*) \le G$, independent of $n$. Define $A_n$, $B_n$, and $C_n$ by (5) and perform the transformation (7). We find as in Theorem 1 that $u_n$ and $v_n$ are of class $D_2$ in $(\xi, \eta)$ on $\Sigma$ and satisfy

$$v_{n\xi} = -(u_{n\eta} + E_n), \qquad v_{n\eta} = u_{n\xi} + D_n \tag{9}$$

almost everywhere, where

$$D_n = d_n y_{n\eta} - e_n x_{n\eta}, \qquad E_n = -(d_n y_{n\xi} - e_n x_{n\xi}). \tag{10}$$
Also 


A,C, — B? = 1, | An|, | Bal, |Ca| S K(M, m), 
+ U?) A,U2 + 2B,U,U,+C,U? L-(U2 + U}?), 
+ E,2)dtdn N\(M,mMi, A, do)-p®, > 0,4 = (M, A), 


C(P,p)-= 


(11) 
ff (A,U2 + 2B,U,U, + C,U,?)dxdy = ff (U? + U?)dédn, 
D 


where U stands for any function of class D2, and D and A, denote correspond- 
ing regions of R and = respectively, and K, NM; and \ depend only on the 
quantities indicated, and / and L depend only on M and m. These results 
are obtained by straightforward computation, the use of Theorems 3 and 4 
of §2, and the relations of Theorem 1. 

Thus 


(12) Un, = H,(é, n) + uo,n(é, ”); 


where u,, has the significance of §4 and H,(é, 7) is the function harmonic 
on 3, and taking on the boundary values of H,*(£, n). Now, as in Theorem 1, 
we see that D(un) <L-G/l and that a subsequence {,,} converges uniformly 
on R to a function u(x, y) and {2,,} converges to a function »(x, y) uniformly 




on each closed subregion of $R$. Thus $u$ and $v$ are of class $D_2$ and, as in Theorem 1, are seen to satisfy (1) almost everywhere. It is clear that (i), (ii), and (iii) are fulfilled. That the pair $(u, v)$ is unique follows from Theorem 1, since if $u'$ and $v'$ were another pair with the same boundary values, $u' - u$ and $v' - v$ would satisfy (1) with $d = e = 0$ and obey the conclusions (i), (ii), and (iii) of Theorem 1 with $u' - u = 0$ on $R^*$ and $v'(x_0, y_0) - v(x_0, y_0) = 0$. That $|u| \le P$, which depends only on $M$, $m$, $\Lambda$, $d_0$, and the maximum of $|H^*|$ on $R^*$, follows from Theorems 1 and 2 of §3.

Using Theorem 3, §2, and (12), we see that (iv) holds except that 
and m intervene instead of M and m. Using (11), we see that 


f f (uz + u,)dxdy < lim inf f f (tine + Uny)dxdy 
D D 


1 
lim inf f f (Antine + 2Bathnzttny + Catlay)dxdy 
D 


2 


IIA 


1 
= lim inf f (tine Man) dédn 
An 


where WN depends on the quantities indicated in the theorem except that 7 
and m intervene instead of M and m; this is true since 


f foe + f f (D2 + 


and regions D at a distance = 6 from R* are carried into regions A,, at a dis- 
tance =m/(6) from =*, where depends only on M, m, and (R; q, 7; 
a, x, p) and in such a region | H,,| and | H,,,| <2h,2-'[m(8)]-! where h,, de- 
notes the maximum of | H,*| on R*. To get rid of M@ and m, merely define 
A, B, C in terms of a, b,, be, c, and u and perform (7); all the conclusions then 
hold with M and m replacing M and m. 

THEOREM 3. Let $\{a_n\}$, $\{b_{1,n}\}$, $\{b_{2,n}\}$, $\{c_n\}$, $\{d_n\}$, and $\{e_n\}$ be sequences of measurable functions which converge to $a$, $b_1$, $b_2$, $c$, $d$, and $e$ respectively almost everywhere and which satisfy


| an|, | bial, | ben], | | dal, | en | 
— (bin + ben)? a, > 0. 


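For each $n$, let $u_n$ and $v_n$ be of class $D_2$ on $\bar R$ and $R$ respectively and satisfy

$$v_{nx} = -(b_{2,n} u_{nx} + c_n u_{ny} + e_n), \qquad v_{ny} = a_n u_{nx} + b_{1,n} u_{ny} + d_n$$

almost everywhere with $|u_n| \le G$ on $R^*$ ($M$, $m$, and $G$ being independent of $n$).
Then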




(i) la], [be], le], |e] $M, a>0, 

(ii) {un} and {vn} are uniformly bounded and satisfy uniform Hélder con- 
ditions on each closed subregion D of R, which depend only on M, m, R, G, and 
the distance 5(D) of D from R*, 

(iii) dy <= P[M, m, G, R, 6(D)], and 

(iv) if the subsequences $\{u_{n_k}\}$, $\{v_{n_k}\}$ are chosen to converge uniformly on each closed subregion $D$ of $R$ to functions $u$ and $v$, then $u$ and $v$ satisfy (ii) and (iii) and

$$v_x = -(b_2 u_x + c u_y + e), \qquad v_y = a u_x + b_1 u_y + d,$$

almost everywhere on $R$. The same conclusions hold if we merely assume that each $u_n$ is of class $D_2$ on $R$ with $|u_n| \le G$.

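Proof. (i) is obvious and (ii) and (iii) have been proved in Theorem 2. To prove (iv), let $D$ be a closed subregion of $R$. The conclusions (i), (ii), and (iii) hold for $D$ and $\{u_{n_k}\}$ and $\{v_{n_k}\}$ converge uniformly on $D$ to $u$ and $v$ which are of class $D_2$ on $D$. Then, by Lemma 6, §1,

$$\iint_D \bigl[(v_x + b_2 u_x + c u_y + e)^2 + (v_y - a u_x - b_1 u_y - d)^2\bigr]\,dx\,dy$$
$$\le \liminf_{k\to\infty} \iint_D \bigl[(v_{n_k x} + b_{2,n_k} u_{n_k x} + c_{n_k} u_{n_k y} + e_{n_k})^2 + (v_{n_k y} - a_{n_k} u_{n_k x} - b_{1,n_k} u_{n_k y} - d_{n_k})^2\bigr]\,dx\,dy = 0.$$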
THEOREM 4. If $a$, $b_1$, $b_2$, $c$, $d$, and $e$ satisfy hypotheses (2) and satisfy
$$|\phi(x_1, y_1) - \phi(x_2, y_2)| \le N r^\lambda, \qquad r = [(x_2 - x_1)^2 + (y_2 - y_1)^2]^{1/2}$$

($\phi(x, y)$ standing for $a$, $b_1$, $b_2$, $c$, $d$, or $e$) on $R$, then if $u(x, y)$ and $v(x, y)$ constitute a solution of (1) with $|u| \le G$, then $u_x$, $u_y$, $v_x$, and $v_y$ satisfy a uniform Hölder condition with the same exponent $\lambda$ on any subregion $D$ of $R$ where the constant $N'$ depends on $M$, $m$, $N$, $\lambda$, $R$, and the distance of $D$ from $R^*$.†


5. Applications to the calculus of variations. In this section, we shall discuss the differentiability properties of a function $z(x, y)$ which minimizes

$$\iint f(x, y, z, p, q)\,dx\,dy \qquad (p = z_x,\ q = z_y) \tag{1}$$

among all functions having the same boundary values. We shall assume that $f$ is continuous together with its first and second partial derivatives for all values of $(x, y, z, p, q)$ and that the second derivatives satisfy a uniform Hölder condition on each bounded portion of $(x, y, z, p, q)$ space, with


t See E. Hopf, loc. cit. 


$$f_{pp} f_{qq} - f_{pq}^2 > 0, \qquad f_{pp} > 0,$$

everywhere.


LEMMA 1.† If $z(x, y)$ is a solution of (1) on a region $R$, which solution satisfies a uniform Lipschitz condition, then

$$\int_{D^*} (f_p\,dy - f_q\,dx) = \iint_D f_z\,dx\,dy$$

for almost all rectangles $D$ in $R$.


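Proof. Let $\zeta(x, y)$ be any function which satisfies a Lipschitz condition on $R$ and which vanishes on $R^*$. Then it follows in the usual way that

$$\iint_R (\zeta f_z + \zeta_x f_p + \zeta_y f_q)\,dx\,dy = 0.$$

Now let $D$: $a \le x \le b$, $c \le y \le d$, be any closed rectangle in $R$, and define

$$\phi(x, y) = \int_a^x f_z[\xi, y, z(\xi, y), p(\xi, y), q(\xi, y)]\,d\xi.$$

Then, if $\zeta$ is any function satisfying a uniform Lipschitz condition on $R$ and zero outside and on $D^*$, we see that

$$\iint_D (\zeta f_z + \zeta_x f_p + \zeta_y f_q)\,dx\,dy = \iint_D [\zeta_x (f_p - \phi) + \zeta_y f_q]\,dx\,dy = 0.$$

Thus, it follows from a theorem of A. Haar‡ that
$$\int_{\Delta^*} (f_p - \phi)\,dy - f_q\,dx = 0$$

for almost all rectangles $\Delta$ of $D$, and hence that

$$\int_{\Delta^*} f_p\,dy - f_q\,dx = \int_{\Delta^*} \phi\,dy = \iint_\Delta f_z\,dx\,dy.$$

Since $R$ may be written as the sum of a denumerable number of rectangles $D$, the lemma follows.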


† This lemma is equivalent to Haar's equations for a minimizing function in a double integral problem, first stated and proved by him in the case that $p$ and $q$ are continuous. See A. Haar, Über die Variation der Doppelintegrale, Journal für die Reine und Angewandte Mathematik, vol. 149 (1919), pp. 1-18.

‡ A. Haar, Über das Plateausche Problem, Mathematische Annalen, vol. 97 (1926-27), pp. 124-158, particularly pp. 146-151.




THEOREM 1. If $z(x, y)$ satisfies the hypotheses of Lemma 1, then it is continuous together with its first and second partial derivatives and the latter satisfy uniform Hölder conditions on each closed subregion of $R$. If $f(x, y, z, p, q)$ is analytic, $z(x, y)$ is also.

Proof. Let $h$ be a small rational number and let $(x_0, y_0)$ be an interior point of $R$. We define the region $R_h$ as the set of all points $(x_1, y_1)$ of $R$ such that (1) the segment $x_1 \le x \le x_1 + h$, $y = y_1$ if $h > 0$, or $x_1 + h \le x \le x_1$, $y = y_1$ if $h < 0$, lies in $R$ and (2) $(x_1, y_1)$ may be joined to $(x_0, y_0)$ by a curve, all of whose points satisfy (1). If $|h| < h_1$, $R_h$ is a non-vacuous Jordan region ($h$ rational). Let $D$ be a closed subregion of $R$; for $|h| < h_1$, $R_h$ contains $D$. Also, if $H$ is any rectangle out of a certain set of "almost all" rectangles of $D$, we have

$$\int_{H^*} \bigl\{ f_p[x + h, y, z(x + h, y), p(x + h, y), q(x + h, y)]\,dy - f_q[x + h, y, z(x + h, y), p(x + h, y), q(x + h, y)]\,dx \bigr\}$$
$$= \iint_H f_z[x + h, y, z(x + h, y), p(x + h, y), q(x + h, y)]\,dx\,dy, \tag{2}$$

$$\int_{H^*} \bigl\{ f_p[x, y, z(x, y), p(x, y), q(x, y)]\,dy - f_q[x, y, z(x, y), p(x, y), q(x, y)]\,dx \bigr\}$$
$$= \iint_H f_z[x, y, z(x, y), p(x, y), q(x, y)]\,dx\,dy, \tag{3}$$

for all rational $h$ with $|h| < h_1$.
Let 


1 rth 
th = — f a(E, 
hd, 


Then 
2(x + h, vy) — 2(x, y) 1 rth 
pr(x, y) = : ’ gn(x, y) q(é, y)dé, 
p(x + h, y) — p(x, y) q(x + h, y) — q(x, y) 
Pre h te h = 


almost everywhere (|/| </;). Clearly p» satisfies a Lipschitz condition on D 
for each h with 0<|h| <M, and hence is of class Ds. Also | p,| is bounded 
independently of h. 

Subtracting (3) from (2) and dividing by h, we find that 




H* 


— paz + pay + en(x, y)]dx} = 0 


on almost all rectangles H of D (for each rational h with || </). Here we 
may take 


ay(x, y) -f x + th, y, y) + t[z(x + h, y) — 2(x, y)], p(x, ¥) 
+ t[p(x + h, y) — p(x, y)], + tlg(a + h, — g(a, y) fae, 


1 1 
ints, 9) = 9) = Saat, 
0 0 
1 h 1 
en( x, y) = f faxdl + 2(x + y) 2(x, f Saat, 
0 h 0 
1 
f Sat 
0 


1 h 
y) -f fpxit + a(x + y) 
0 
1 rth 


h 


where the arguments which are not indicated in 6,, cn, dx, and e, are all the 
same as that in a,. Clearly |a,|, |cx|, |da|, and |e,| are uniformly 
bounded for | k| </, and it can easily be shown as in the proof of Lemma 2, 
§2, that 

— bP =m> 0 


for all h with |h| </,. Also from (4), it follows that, for each such h, there 
exists a function v,(x, vy) which satisfies a Lipschitz condition on R and which 
satisfies 


Yo) = 0, az = — + Cottny + Cn), = + Onttay + dh 


almost everywhere. Also, if (all rational with | {an.}, {ban}, 
{canf, {dan}, and {en,} tend to fp»[x, y, 2(x, y), p(x, y), ¥)], foe faa» 
p—fz, and faz p respectively. Hence, from Theorem 3, §4 and 
the fact that p,,, tends to p almost everywhere, it follows that p(x, y) satisfies 
a uniform Hélder condition on each closed subregion A of D. Similarly it 
may be shown that q(x, y) also satisfies the same type of condition. 

If we choose a subsequence so {2}, then we have that 


v2 = — + + [faz + Vy = + fraby + — 


almost everywhere. From Theorem 4, §4 it follows that p, and p, satisfy 




uniform Hölder conditions on each subregion of $R$; the same statement holds for $q_x$ and $q_y$. This proves the theorem. The last statement has been shown to hold by E. Hopf† if it is known that $p(x, y)$ and $q(x, y)$ satisfy Hölder conditions, which fact we proved above.

6. Applications to quasi-linear elliptic equations. In this section, we shall consider the equations of the form

$$a(x, y) z_{xx} + 2b(x, y) z_{xy} + c(x, y) z_{yy} = d(x, y); \tag{1}$$
$$a(x, y, z, p, q) z_{xx} + 2b(x, y, z, p, q) z_{xy} + c(x, y, z, p, q) z_{yy} = \lambda d(x, y, z, p, q), \tag{2}$$

in which we assume that the functions $a$, $b$, $c$, and $d$ are defined and continuous for all values of their arguments with $ac - b^2 = 1$, $a > 0$, and that these functions satisfy a uniform Hölder condition in each bounded portion of the space in which they are defined. Then it is known‡ that there exists a solution of (1) which is defined in the unit circle $\Sigma$ and vanishes on $\Sigma^*$ and that its second derivatives satisfy a uniform Hölder condition on each closed subregion of $\Sigma$. A more precise statement is given in Lemma 1 below.


LEMMA 1. Let $z(x, y)$ be the solution of (1) which vanishes on $\Sigma^*$, let $k$ be the maximum of $|d|/(a + c)$ on $\Sigma$, and let $l$ be the maximum of $(a + c)$ on $\Sigma$. Then

$$|p|, |q| \le 120\, l^3 k$$

on $\Sigma$, and $p$ and $q$ satisfy uniform Hölder conditions on each closed subregion $\Delta$ of $\Sigma$ which depend only on $k$ and $l$ and the distance of $\Delta$ from $\Sigma^*$, and $z_{xx}$, $z_{xy}$, and $z_{yy}$ satisfy uniform Hölder conditions on each closed subregion $\Delta$ of $\Sigma$ which depend only on the above and on the Hölder conditions satisfied by $a$, $b$, $c$, and $d$.

Proof. First, suppose that $a$, $b$, $c$, and $d$ are analytic on $\Sigma$ so that $z$ is analytic on $\Sigma$. It is known§ that if $d_1(x, y) \le d(x, y) \le d_2(x, y)$ on $\Sigma$ and if $z_1$, $z$, and $z_2$ are the corresponding solutions of (1), then $z_1 \ge z \ge z_2$. Hence if we choose

$$z_1 = k(1 - x^2 - y^2)/2, \qquad z_2 = -z_1,$$
$$d_1(x, y) = -k(a + c), \qquad d_2(x, y) = k(a + c),$$

we see that $z_2 \le z \le z_1$ on $\Sigma$. Also, since $z_1 - z = 0$ and $z - z_2 = 0$ on $\Sigma^*$, we see that


† Loc. cit.

‡ For it is known (see Lichtenstein, loc. cit.) that the solution exists if $a$, $b$, $c$, and $d$ are analytic and the result follows from Theorem 4, §4 by approximations.

§ See for instance S. Bernstein, loc. cit., second paper.




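$$-k = \frac{\partial z_1}{\partial r} \le \frac{\partial z}{\partial r} \le \frac{\partial z_2}{\partial r} = k$$
on $\Sigma^*$. Since $\partial z / \partial\theta = 0$ on $\Sigma^*$, it follows that
$$|p|, |q| \le k \quad \text{on } \Sigma^*.$$
Now $p$ and $q$ satisfy the equations
$$a p_x + 2b p_y + c q_y = d, \qquad p_y - q_x = 0.$$

If we set $u = -p$, $v = q$, and $v = p$, $u = q$, the equations become

$$v_x = -u_y, \qquad v_y = \frac{a}{c} u_x + \frac{2b}{c} u_y + \frac{d}{c}, \tag{3}$$
$$v_x = -\Bigl(\frac{2b}{a} u_x + \frac{c}{a} u_y - \frac{d}{a}\Bigr), \qquad v_y = u_x, \tag{4}$$

respectively, each of which is a system of the type treated in §4. Thus, by Theorem 3, §4, we see that $p$ and $q$ are bounded and satisfy uniform Hölder conditions as desired in the lemma, and from Theorem 4, §4, we see that $z_{xx}$, $z_{xy}$, and $z_{yy}$ satisfy the desired conditions; these conditions hold whether $a$, $b$, $c$, and $d$ are analytic or not.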
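
The comparison step at the start of this proof rests on a short computation, written out below as a sketch. It assumes that the comparison function is $z_1 = k(1 - x^2 - y^2)/2$ on the unit circle, which is a reading of the damaged display, and it uses $|d| \le k(a + c)$, which follows from the definition of $k$.

```latex
% Assumed comparison function (reading of the damaged display):
%   z_1 = k (1 - x^2 - y^2) / 2,   z_2 = -z_1.
\begin{align*}
z_{1,xx} &= -k, \qquad z_{1,xy} = 0, \qquad z_{1,yy} = -k, \\
a\,z_{1,xx} + 2b\,z_{1,xy} + c\,z_{1,yy} &= -k(a + c) \;\le\; d \;\le\; k(a + c), \\
\left.\frac{\partial z_1}{\partial r}\right|_{r=1} &= \left. -k r \right|_{r=1} = -k .
\end{align*}
% So z_1 solves (1) with right-hand side d_1 = -k(a+c), z_2 = -z_1 solves it
% with d_2 = k(a+c), and the radial derivative of z on the boundary is
% squeezed between -k and k.
```

Since $z$ vanishes on $\Sigma^*$, its tangential derivative is zero there, so the bound on the radial derivative gives the bound on $|p|$ and $|q|$ on $\Sigma^*$.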

To see what the bounds are for p and gq, let R be the unit circle and let 
m, k, p and p, g, r be three equally spaced points on =* and R* respectively. 
Clearly (3; 7, x, p) and (R; p, g, r) satisfy D(L, do) conditions where 
L=4r-3-* and dy=3"?. In making the transformation (7) of §4, we see 
that the ratio of the maximum to the minimum magnification at each point 
is given by u+(u?—1)”? where 


—)u; — 


2 a 
+ — + 
c 
2 
eto 
2 


since a+c2=2. Thus the K of Theorem 2, §2 is /?—2, and hence 


ff (xe + a2 + ye + y2)dédn S — 2)(3-12. 2p) 


Since, on R, D=(d/c)y,, E= —(d/c)y; in (3) and D=(d/a)x,, E= —(d/a)x; 
in (4) (using relations (10) of §4), it follows that 




— 2) ad? 
C(P,p)-R (x, y) on = 31/2 c 
— 2) d? 1 
ff (D? + E*)dtdn max —————-—:p”, X= 
C(P,p)-R (x, 31/2 a? (72? — 2)(L + 1)? 


in cases (3) and (4) respectively. Let us call the first yp® and the second ap”. 
Referring to Theorem 1, §3 and using the notations of Theorems 2 and 3 
and the remark, we see that 


Y 1/2 a 1/2 
| Us|, | Vil 4(1 + [2a]}-) | | Vi] $444 [2a}-) (<) 
2a 
in cases (3) and (4) respectively so that 
Y 1/2 a 1/2 
| | | 81+ [24 ]-) | uo,1|, | v0,1| 81+ [24]-) (=) 
2r 2r 


in (3) and (4) respectively, so that, finally 


1 Y 1/2 1 a 1/2 
< k <s(1+—)(* k 
| «|, s8(1+-)(*) + &, |u|, (5) 


in (3) and (4) respectively: Since | «|, |v| are merely | p|, |g| in one order or 
the other, and since 2|d|/(a+c) is between |d|/c and |d|/a, we may sub- 
stitute 4k? for d?/a? and d?/c* so that we obtain 


lel, 8[1 + — + 2k < 


since />1. 

DEFINITION 1. A function $z^*(x, y)$ is said to satisfy a three point condition with constant $A$ on $\Sigma^*$ if for each plane $z = ax + by + c$ which passes through three points of the curve $z = z^*(x, y)$, $(x, y)$ on $\Sigma^*$, we have $a^2 + b^2 \le A^2$.

LEMMA 2. Let $z^*(x, y)$ satisfy a three point condition with constant $A$ on $\Sigma^*$, and let $z$ be the solution of (1) which takes on these boundary values, $d(x, y)$ being assumed to be identically zero on $\Sigma$. Then

(i) $\displaystyle \min_{(x, y) \text{ on } \Sigma^*} z^* \le z(x, y) \le \max_{(x, y) \text{ on } \Sigma^*} z^*$, $(x, y)$ in $\Sigma$, and

(ii) $p^2 + q^2 \le A^2$.

Proof. This is well known.†


THEOREM 1. Let $z^*(x, y)$ satisfy a three point condition with constant $A$ on


† See, for instance, J. Schauder, Über das Dirichletsche Problem im Grossen für nichtlineare elliptische Differentialgleichungen, Mathematische Zeitschrift, vol. 37 (1933), pp. 623-634.




$\Sigma^*$ and let us assume that $\lambda = 0$ in equation (2). Then there exists a solution $z(x, y)$ of (2) which coincides with $z^*$ on $\Sigma^*$.

Proof. Let $M$ denote the maximum of $|z^*|$ on $\Sigma^*$ and let $l$ be the least upper bound of $a(x, y, z, p, q) + c(x, y, z, p, q)$ for all functions $z(x, y)$ which satisfy Lipschitz conditions and for which $|z| \le M$, $p^2 + q^2 \le A^2$.

Now, let $z_0(x, y)$ be the harmonic function taking on the given boundary values and define functions $z_n(x, y)$ for $n \ge 1$ as the solutions of

$$a(x, y, z_n, p_n, q_n) z_{n+1,xx} + 2b(x, y, z_n, p_n, q_n) z_{n+1,xy} + c(x, y, z_n, p_n, q_n) z_{n+1,yy} = 0$$

which coincide with $z^*$ on $\Sigma^*$. At each stage
$$|z_n| \le M, \qquad p_n^2 + q_n^2 \le A^2$$

and $a_n$, $b_n$, $c_n$ satisfy Hölder conditions which depend only on $A$, $M$, and $l$, where $a_n$ stands for $a(x, y, z_n, p_n, q_n)$, for instance. Since $p_n$ and $q_n$ satisfy the equations

$$a_{n-1} p_{nx} + 2b_{n-1} p_{ny} + c_{n-1} q_{ny} = 0, \qquad p_{ny} - q_{nx} = 0$$

and $|p_n|, |q_n| \le A$, it follows from §4 that $p_n$ and $q_n$ satisfy uniform Hölder conditions on each closed subregion $D$ of $\Sigma$ which depend only on $l$, $A$, and the distance of $D$ from $\Sigma^*$ (and not on $n$). Thus, by Theorem 4, §4, $z_{n,xx}$, $z_{n,xy}$, and $z_{n,yy}$ satisfy similar conditions, since $a_{n-1}$, $b_{n-1}$, and $c_{n-1}$ satisfy uniform Hölder conditions on subregions of $\Sigma$ which depend only on the above quantities. Thus a subsequence $\{n_k\}$ may be chosen so that the sequence $\{z_{n_k}\}$ converges uniformly on $\Sigma$ to a function $z$ and so that the sequences $\{z_{n_k,x}\}, \ldots,$ and $\{z_{n_k,yy}\}$ converge uniformly on each closed subregion of $\Sigma$ to the corresponding derivatives of $z$. Clearly $z$ is the desired solution.


THEOREM 2. If $|\lambda|$ is sufficiently small, the equation (2) possesses a solution which vanishes on $\Sigma^*$.


Proof. Let $l(z)$ and $k(z)$ denote the least upper bounds of

$$a[x, y, z(x, y), p(x, y), q(x, y)] + c[x, y, z(x, y), p(x, y), q(x, y)]$$
and
$$\frac{|d[x, y, z(x, y), p(x, y), q(x, y)]|}{a[x, y, z(x, y), p(x, y), q(x, y)] + c[x, y, z(x, y), p(x, y), q(x, y)]}$$

respectively. For all $z$ with $z^2 + p^2 + q^2 \le \alpha^2$, $l(z) \le l$, $k(z) \le k$. Now if $z(x, y)$ is a function for which $z^2 + p^2 + q^2 \le \alpha^2$ and for which $p(x, y)$, $q(x, y)$ satisfy a uniform Hölder condition on each closed subregion of $\Sigma$, the solution $Z$ of

$$a(x, y, z, p, q) Z_{xx} + 2b(x, y, z, p, q) Z_{xy} + c(x, y, z, p, q) Z_{yy} = \lambda d(x, y, z, p, q)$$

which vanishes on $\Sigma^*$ is such that $Z_{xx}$, $Z_{xy}$, $Z_{yy}$ satisfy Hölder conditions on each closed subregion $D$ of $\Sigma$ which depend only on the Hölder conditions satisfied by $a$, $b$, $c$, $\lambda d$ and $z$, $p$, and $q$, and $Z_x$ and $Z_y$ satisfy Hölder conditions which depend only on $\alpha$, $k$, $l$, and $\lambda$. Also, by Lemma 1

$$(Z^2 + P^2 + Q^2)^{1/2} \le 120 \cdot 3^{1/2} \cdot l^3 \cdot k \cdot |\lambda|$$
which is $\le \alpha$ if $\lambda$ is small enough. The successive approximations may be carried through as in Theorem 1.


THEOREM 3. Let $L(G)$ and $K(G)$ denote the least upper bounds of $l(z)$ and $k(z)$, respectively, for all $z$ with $z^2 + p^2 + q^2 \le G^2$. If, in addition to our hypotheses, we assume that
$$\lim_{G\to\infty} \frac{L^3(G) \cdot K(G)}{G} = 0,$$

the equation (2) possesses a solution on $\Sigma$ which vanishes on $\Sigma^*$ for each value of $\lambda$.

Proof. For each $\mu > 0$, there exists a number $N_\mu$ such that
$$120 \cdot 3^{1/2} \cdot L^3(G) \cdot K(G) \le \mu G$$

for all $G > N_\mu$. Let $M_\mu$ be the least upper bound of $120 L^3(G) K(G) \cdot 3^{1/2}$ for all $G \le N_\mu$. If we let $P_\mu$ be the larger of $N_\mu$ and $\mu^{-1} \cdot M_\mu$, we see that

$$120 \cdot 3^{1/2} \cdot l^3(z) \cdot k(z) \le \mu P_\mu$$

for all $z$ for which $z^2 + p^2 + q^2 \le P_\mu^2$. Thus, if $z$ is such a function and $Z$ is the solution of

$$a(x, y, z, p, q) Z_{xx} + 2b(x, y, z, p, q) Z_{xy} + c(x, y, z, p, q) Z_{yy} = \lambda d(x, y, z, p, q)$$
which vanishes on $\Sigma^*$, then
$$(Z^2 + P^2 + Q^2)^{1/2} \le |\lambda| \mu P_\mu$$

which is $\le P_\mu$ if $|\lambda| \le \mu^{-1}$. The remainder of the proof proceeds as in Theorems 1 and 2.


UNIVERSITY OF CALIFORNIA, 
BERKELEY, CALIF. 


A CLASS OF POLYNOMIALS* 


BY 
LEONARD CARLITZ 


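1. Introduction. For an indeterminate $x$ in the $GF(p^n)$, put

$$[k] = x^{p^{nk}} - x, \qquad F_k = [k][k-1]^{p^n} \cdots [1]^{p^{n(k-1)}}, \qquad F_0 = 1; \tag{1.1}$$

then we define the function $\psi(t)$ by means of

$$\psi(t) = \sum_{k=0}^{\infty} \frac{(-1)^k}{F_k}\, t^{p^{nk}}, \tag{1.2}$$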
where ¢ takes on the values 
t= (c; in GF(p")). 
i=0 


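Then $\psi(t)$ has the linearity properties

$$\psi(t_1 + t_2) = \psi(t_1) + \psi(t_2), \qquad \psi(ct) = c\,\psi(t), \tag{1.3}$$
for arbitrary $c$ in $GF(p^n)$; further from (1.2) it follows that
$$-\psi(xt) = \psi^{p^n}(t) - x\,\psi(t). \tag{1.4}$$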
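
The step from (1.2) to (1.4) can be written out as follows. This is a sketch assuming the definitions $[k] = x^{p^{nk}} - x$ and $F_k = [k] F_{k-1}^{p^n}$ with $F_0 = 1$; the recurrence for $F_k$ is the one used in §2, while the form of $[k]$ is the standard one and is not legible in the damaged display (1.1).

```latex
% Assumed: [k] = x^{p^{nk}} - x,  F_k = [k] F_{k-1}^{p^n},  F_0 = 1.
\begin{align*}
\psi(xt) &= \sum_{k \ge 0} \frac{(-1)^k}{F_k}\, x^{p^{nk}}\, t^{p^{nk}}
          = \sum_{k \ge 0} \frac{(-1)^k}{F_k}\, \bigl([k] + x\bigr)\, t^{p^{nk}} \\
         &= x\,\psi(t) + \sum_{k \ge 1} \frac{(-1)^k}{F_{k-1}^{\,p^n}}\, t^{p^{nk}}
          = x\,\psi(t) - \Bigl( \sum_{j \ge 0} \frac{(-1)^j}{F_j}\, t^{p^{nj}} \Bigr)^{p^n} \\
         &= x\,\psi(t) - \psi^{p^n}(t).
\end{align*}
% The k = 0 term of the [k]-part drops out because [0] = 0; the last step on
% the second line uses k = j + 1 and the additivity of the p^n-th power map
% in characteristic p.
```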

In turn (1.4) implies the general relation 

(1.5) (— 1)"¥(M2) = ou), 

where M is a polynomial in GF(p") of degree m in x, and 
(1.6) wy(u) = 


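It remains to define $\psi_k(t)$. We put

$$\begin{bmatrix} k \\ i \end{bmatrix} = \frac{F_k}{F_i\, L_{k-i}^{\,p^{ni}}}, \qquad \begin{bmatrix} k \\ 0 \end{bmatrix} = \frac{F_k}{L_k}, \qquad \begin{bmatrix} k \\ k \end{bmatrix} = 1,$$
for $F_i$ as defined in (1.1), and
$$L_k = [k][k-1] \cdots [1], \qquad L_0 = 1;$$
then we have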


* Presented to the Society, December 29, 1936; received by the editors November 23, 1936. 
† For a discussion of $\psi(t)$ and $\psi_k(t)$ see the Duke Mathematical Journal, vol. 1 (1935), pp. 137-168.




168 LEONARD CARLITZ [March 


$$\psi_k(t) = \sum_{i=0}^{k} (-1)^{k-i} \begin{bmatrix} k \\ i \end{bmatrix} t^{p^{ni}}, \qquad \psi_0(t) = t. \tag{1.7}$$
In this paper we shall be interested first in the polynomials $\omega_M(u)$. Evidently (1.5) implies

$$\omega_{MN}(u) = \omega_M(\omega_N(u)), \tag{1.8}$$

for arbitrary polynomials $M$, $N$. Assume next that $M$ is primary, that is, the coefficient of the highest power of $x$ occurring in $M$ is the unit element of $GF(p^n)$. Then we define a class of polynomials $W_M(u)$ related to $\omega_M(u)$ by means of
$$\omega_M(u) = \prod_{A \mid M} W_A(u), \tag{1.9}$$
the product extending over all (primary) polynomials $A$ dividing $M$.
As we shall see below, the polynomials $W_M(u)$ have many properties analogous to those of the well-known cyclotomic polynomials.* In particular $W_M(u)$ is irreducible in the ring $F[u]$, where $F = F(x, p^n)$ is the field of rational functions of $x$ with coefficients in $GF(p^n)$. Again if $P$ is an irreducible polynomial in $x$, the factorization of $W_M(u)$ (mod $P$) is determined by a very simple rule. For example, if $P \nmid M$, define $e > 0$ as the smallest exponent such that

$$P^e \equiv 1 \pmod{M},$$

and put $\phi(M) = er$, where $\phi(M)$ is the Euler function for polynomials $M$; then we have the factorization

$$W_M(u) \equiv f_1(u) f_2(u) \cdots f_r(u) \pmod{P},$$


where each f;(u) is irreducible (mod P) and of degree e in u. Applications are 
made to the congruence wa(u) =6 (mod P). 
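
The quantities $e$, $\phi(M)$, and $r$ in this rule can be computed by brute force in a small case. The sketch below is illustrative only: it takes $p^n = 2$, stores a polynomial in $GF(2)[x]$ as an integer bitmask (bit $i$ is the coefficient of $x^i$), and assumes the usual definition of the Euler function as the number of residues modulo $M$ that are prime to $M$. All function names are hypothetical.

```python
def pmul(a, b):
    """Carry-less product of two GF(2)[x] polynomials stored as bitmasks."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def pmod(a, m):
    """Remainder of a modulo m in GF(2)[x]."""
    dm = m.bit_length()
    while a.bit_length() >= dm:
        a ^= m << (a.bit_length() - dm)
    return a

def pgcd(a, b):
    """Greatest common divisor in GF(2)[x] (always monic over GF(2))."""
    while b:
        a, b = b, pmod(a, b)
    return a

def euler_phi(M):
    """Number of residues A mod M with gcd(A, M) = 1 (assumed definition)."""
    degree = M.bit_length() - 1
    return sum(1 for A in range(1, 1 << degree) if pgcd(A, M) == 1)

def order_mod(P, M):
    """Smallest e > 0 with P^e = 1 (mod M); requires gcd(P, M) = 1."""
    if pgcd(P, M) != 1:
        raise ValueError("P must be prime to M")
    e, acc = 1, pmod(P, M)
    while acc != 1:
        acc = pmod(pmul(acc, P), M)
        e += 1
    return e

# Example: M = x^4 + x + 1 and P = x^2 + x + 1 over GF(2).
M, P = 0b10011, 0b111
e = order_mod(P, M)
r = euler_phi(M) // e   # number of factors predicted by the rule
```

For this pair the rule predicts $r$ factors of degree $e$ each, with $er = \phi(M)$; the sketch only computes the exponents and does not factor $W_M(u)$ itself.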

2. Notation; properties of $\omega_M(u)$. It will be convenient to fix certain notation. If $GF(p^n)$ denotes a fixed Galois field of order $p^n$, we denote by $R = R(x, p^n)$ the ring of polynomials in the indeterminate $x$ with coefficients in $GF(p^n)$. Similarly $F = F(x, p^n)$ denotes the field of rational functions of $x$ with coefficients in $GF(p^n)$. For an additional indeterminate $u$, $R[u]$ and $F[u]$ denote rings of polynomials in $u$ with coefficients in $R$ and $F$, respectively. Elements of $GF(p^n)$ will usually be denoted by $c$, $c_i$; elements of $R$ (in other words, polynomials in $x$ over the Galois field) by $A$, $B$, $C$, $D$, $H$, $M$, $N$, $P$, where $P$ denotes a typical irreducible polynomial in $x$. The polynomial $M$ is said to be primary if the coefficient of the highest power of $x$ occurring in $M$ is the unit element of $GF(p^n)$. Typical elements of $R[u]$ or $F[u]$ will be denoted by $f(u)$, $g(u)$, $h(u)$. The degree of $M$ (for $M$ in $R$) has the obvious meaning; the degree of $f(u)$ means the degree in $u$. If the coefficient of the highest power of $u$ occurring in $f(u)$ is the unit element of $R(x, p^n)$, $f(u)$ is primary.

* The cyclotomic polynomial $F_m(x)$ is the polynomial (with leading coefficient $= 1$) whose roots are the primitive $m$th roots of unity.

According to the formula (1.5), $\omega_M(u)$ is defined by means of the function $\psi(t)$. However as the present paper is concerned only with algebraic properties, we shall define $\omega_M(u)$ directly and show that all the properties of the polynomials follow readily from the new definition. One possibility is to take (1.6) as the definition, but it is perhaps more satisfactory to proceed somewhat differently.

For defining properties* we shall take

$$\omega_{M+N} = \omega_M + \omega_N, \qquad \omega_{cM} = c\,\omega_M \quad (c \text{ in } GF(p^n)), \qquad \omega_{x^{k+1}} = \omega_{x^k}^{p^n} - x\,\omega_{x^k}, \tag{2.1}$$

where $M$, $N$ are arbitrary polynomials in $R$, and for brevity we write $\omega_M$ in place of $\omega_M(u)$. Then it is easy to see, to begin with, that for all $M$,

$$\omega_{xM} = \omega_M^{p^n} - x\,\omega_M, \tag{2.2}$$
thus generalizing the third equation in (2.1). Again if in that equation we take $k = 0$, we have

$$\omega_x = \omega_1^{p^n} - x\,\omega_1 = u^{p^n} - xu;$$
combining this with (2.2) we see that
$$\omega_{xM} = \omega_x(\omega_M). \tag{2.3}$$
In this equation replace $M$ by $xM$; then (2.3) becomes

$$\omega_{x^2 M} = \omega_x(\omega_{xM}) = \omega_x\{\omega_x(\omega_M)\} = \omega_{x^2}(\omega_M),$$

since by the third equation in (2.1)

$$\omega_{x^2} = \omega_x^{p^n} - x\,\omega_x = \omega_x(\omega_x).$$
Continuing in this way we may show by an easy induction on $k$ that
$$\omega_{x^k M} = \omega_{x^k}(\omega_M), \tag{2.4}$$

* If the operator $Q$ is defined by $Qu = u^{p^n}$, then the third equation in (2.1) implies $\omega_{x^{k+1}} = (Q - x)\,\omega_{x^k}$, and it is easy to see that generally $\omega_M = M(Q - x)u$, where $M(Q - x)$ is the operator obtained by substituting $Q - x$ for $x$ in $M$.


170 LEONARD CARLITZ {March 


thus generalizing (2.2). If now we take (2.4) together with the first two equa- 
tions in (2.1), we have at once 


(2.5) wun = wu(wy) = wy(wm), 


for arbitrary M, N. Thus we see that (1.8) follows from the new definition 
(2.1). For the sequel this property is apparently fundamental. 
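The relations (2.1) and (2.5) lend themselves to a mechanical check in small cases. The following is only an illustrative sketch, not part of the original argument: it assumes $n=1$ (so the Galois field is $GF(p)$), stores a polynomial in $x$ as a list of coefficients (lowest degree first), stores an additive polynomial $\sum a_j(x)u^{p^j}$ as the list of its $a_j$, and builds $\omega_M$ from the operator form $M(Q-x)u$ given in the footnote. All function names are hypothetical.

```python
# Polynomials in x over GF(p): lists of ints, lowest degree first ([] is zero).
# Additive polynomials sum_j a_j(x) * u^(p^j): lists [a_0, a_1, ...].

def trim(a):
    a = list(a)
    while a and a[-1] == 0:
        a.pop()
    return a

def add(a, b, p):
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return trim([(s + t) % p for s, t in zip(a, b)])

def mul(a, b, p):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, s in enumerate(a):
        for j, t in enumerate(b):
            out[i + j] = (out[i + j] + s * t) % p
    return trim(out)

def frob(a, p, times=1):
    """a(x)^(p^times), which equals a(x^(p^times)) for coefficients in GF(p)."""
    step = p ** times
    out = [0] * ((len(a) - 1) * step + 1) if a else []
    for i, s in enumerate(a):
        out[i * step] = s
    return out

def apply_T(f, p):
    """Apply Q - x to an additive polynomial f (Q raises to the p-th power)."""
    out = [[] for _ in range(len(f) + 1)]
    for j, a in enumerate(f):
        out[j + 1] = add(out[j + 1], frob(a, p), p)      # contribution of Q
        out[j] = add(out[j], mul([0, p - 1], a, p), p)   # contribution of -x
    return out

def omega(M, p):
    """omega_M = M(Q - x) u, following the operator form of the footnote."""
    M = trim(M)
    result = [[] for _ in range(max(len(M), 1))]
    power = [[1]]                                        # (Q - x)^0 u = u
    for c in M:
        for j, a in enumerate(power):
            result[j] = add(result[j], mul([c % p], a, p), p)
        power = apply_T(power, p)
    return result

def compose(f, g, p):
    """f(g(u)) for additive polynomials f and g."""
    out = [[] for _ in range(len(f) + len(g) - 1)]
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = add(out[i + j], mul(a, frob(b, p, i), p), p)
    return out
```

With $p=3$, `omega([0, 1], 3)` returns `[[0, 2], [1]]`, that is, $u^3 - xu$, and for $M = x+1$, $N = x^2+2$ the three members of (2.5) agree when computed as `omega(mul(M, N, 3), 3)` and as the two compositions.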

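It is now not difficult to derive the explicit formula (1.6) for the polynomial $\omega_M(u)$. Because of the linearity (with respect to $t$) of the polynomial $\psi_j(t)$ it is sufficient to prove (1.6) in the case $M = x^m$. The formula is clearly true for $m=0$. Assume it true up to and including the value $m$. Then by the third equation in (2.1),

$$\begin{aligned}
\omega_{x^{m+1}} &= \sum_{j=0}^{m} \frac{(-1)^{m-j}}{F_j^{\,p^n}}\,\psi_j^{\,p^n}(x^m)\,u^{p^{n(j+1)}} - x\sum_{j=0}^{m} \frac{(-1)^{m-j}}{F_j}\,\psi_j(x^m)\,u^{p^{nj}} \\
&= \sum_{j=0}^{m+1} (-1)^{m+1-j} \left\{ \frac{\psi_{j-1}^{\,p^n}(x^m)}{F_{j-1}^{\,p^n}} + \frac{x\,\psi_j(x^m)}{F_j} \right\} u^{p^{nj}},
\end{aligned} \tag{2.6}$$

the first term in the braces being absent for $j=0$. But it is easily seen that (1.7) implies*

$$\psi_j(xt) = [j]\,\psi_{j-1}^{\,p^n}(t) + x\,\psi_j(t);$$

also $\psi_0(x^m) = x^m$ and $F_k = [k]F_{k-1}^{\,p^n}$; thus (2.6) becomes

$$\omega_{x^{m+1}} = \sum_{j=0}^{m+1} \frac{(-1)^{m+1-j}}{F_j}\,\psi_j(x^{m+1})\,u^{p^{nj}};$$

this completes the induction, and therefore establishes (1.6). It is also evident from the induction that the coefficient of $u^{p^{nj}}$ in (1.6) is integral, that is, $\psi_j(M)/F_j$ is a polynomial in $x$. Our results may be summed up in the following: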


THEOREM 1. The polynomial $\omega_M(u)$ defined by (2.1) for all $M$ (where $M$ is a polynomial in $x$ with coefficients in $GF(p^n)$) satisfies the equation (2.5). The polynomial has the explicit expression (1.6) in which the coefficients of $u^{p^{nj}}$ are polynomials in $x$. In particular $\omega_M(u)$ is linear in $u$† as well as in $M$.

In the next place from (2.5) it follows that $\omega_M(u)$ is a divisor of $\omega_{MN}(u)$. The coefficients of $u^j$ in the quotient are polynomials in $x$. This is a consequence of the following:

THEOREM 2. If in the equation

$$u^{m+r} + M_1u^{m+r-1} + \cdots + M_{m+r} = (u^m + A_1u^{m-1} + \cdots + A_m)(u^r + B_1u^{r-1} + \cdots + B_r)$$

all the coefficients $M$, $A$, $B$ are rational functions of $x$ in the $GF(p^n)$, then the $M$'s are polynomials if and only if all the $A$'s and $B$'s are polynomials.

This is an analogue of a well-known theorem of Gauss; the proof need not be given.

* Duke Mathematical Journal, vol. 1 (1935), p. 141.
† That is, of the form $\sum a_j u^{p^{nj}}$.

Consider now two polynomials $\omega_M(u)$, $\omega_N(u)$. We seek the greatest common divisor $(\omega_M, \omega_N)$. Clearly if $A$ is a common divisor of $M$ and $N$, then $\omega_A$ is a common divisor of $\omega_M$ and $\omega_N$. Let $D = (M, N)$, the greatest common divisor of $M$ and $N$ (to make it unique assume $D$ primary); then

$$D = AM + BN,$$

for properly chosen polynomials $A$, $B$. Then by the first of (2.1),

$$\omega_D = \omega_{AM} + \omega_{BN},$$

from which follows

$$\omega_D = f(u)\,\omega_M + g(u)\,\omega_N, \tag{2.7}$$

where $f(u)$ and $g(u)$ are polynomials in $u$ (whose coefficients are polynomials in $x$). But (2.7) shows that any common divisor of $\omega_M$ and $\omega_N$ is necessarily a divisor of $\omega_D$. This proves the following theorem:

THEOREM 3. For arbitrary $M$, $N$, the greatest common divisor of $\omega_M$ and $\omega_N$ is determined by

$$\omega_D = (\omega_M, \omega_N), \tag{2.8}$$

where $D = (M, N)$, the greatest common divisor of $M$ and $N$.


If $P$ is an irreducible polynomial in $x$, then it follows from (1.6) that

$$\omega_P(u) \equiv u^{p^{nk}} \pmod P, \tag{2.9}$$

where $k$ is the degree of $P$. Therefore by (2.5),

$$\omega_{PM} \equiv (\omega_M)^{p^{nk}} \pmod P.$$

If then $M$ and $e$ are arbitrary, we have

$$\omega_{P^eM} \equiv (\omega_M)^{p^{nke}} \pmod P. \tag{2.10}$$
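The congruence (2.9) can be observed directly in the smallest case. The sketch below is an illustration only and rests on stated assumptions: it takes $p^n = 2$, so that signs play no role and $Q - x$ acts as $u \mapsto u^2 + xu$, and it stores a polynomial in $x$ over $GF(2)$ as a Python integer whose bit $i$ is the coefficient of $x^i$. The helper names are hypothetical.

```python
def clmul(a, b):
    """Product of two polynomials over GF(2) stored as bit masks."""
    out = 0
    while b:
        if b & 1:
            out ^= a
        a <<= 1
        b >>= 1
    return out

def rem(a, P):
    """Remainder of a(x) on division by P(x) over GF(2)."""
    while a.bit_length() >= P.bit_length():
        a ^= P << (a.bit_length() - P.bit_length())
    return a

def omega(M):
    """Coefficients [a_0, a_1, ...] of omega_M = sum_j a_j(x) u^(2^j)."""
    m = M.bit_length()
    result = [0] * max(m, 1)
    power = [1]                          # (Q - x)^0 u = u
    for k in range(m):
        if (M >> k) & 1:                 # coefficient of x^k in M
            for j, a in enumerate(power):
                result[j] ^= a
        nxt = [0] * (len(power) + 1)     # apply Q - x once more
        for j, a in enumerate(power):
            nxt[j + 1] ^= clmul(a, a)    # Q: squaring (characteristic 2)
            nxt[j] ^= a << 1             # -x = +x over GF(2)
        power = nxt
    return result

def is_pure_power_mod(P):
    """True when omega_P(u) reduces to u^(2^k) (mod P), with k = deg P."""
    reduced = [rem(a, P) for a in omega(P)]
    return reduced == [0] * (len(reduced) - 1) + [1]
```

For $P = x^2+x+1$ (the mask `0b111`), `omega` returns `[0b111, 0b111, 1]`, that is, $u^4 + Pu^2 + Pu$, so that $\omega_P(u) \equiv u^4 \pmod P$. For the reducible polynomial $x^2+1$ the check fails, so the irreducibility of $P$ is needed in this instance.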


It will be convenient for a later purpose to alter slightly the notation for the greatest common divisor in order to indicate that we are reducing coefficients (mod $P$). We shall use the symbol $(f(u), g(u))_P$ to denote the G.C.D. in this situation. Thus usually,

$$(f(u), g(u))_P \not\equiv (f(u), g(u)) \pmod P.$$

In the present case, because of (2.7), the two symbols are equivalent, and we may state

THEOREM 4. If $MN$ is not a multiple of the irreducible polynomial $P$, then for $D = (M, N)$, we have

$$(\omega_{P^eM}, \omega_{P^fN})_P \equiv (\omega_D)^{p^{nke}} \pmod P,$$

where $P$ is of degree $k$, and $e \le f$.

The theorem is an immediate consequence of (2.8) and (2.10).

Finally, we ask whether $\omega_M$ can have repeated factors. Since by (1.6) the derivative with respect to $u$ is $(-1)^m M$, which is independent of $u$, there can clearly be no repeated factor. Also if the polynomial be taken (mod $P$), this indicates that for $P \nmid M$ there is no repeated factor. If $P \mid M$, we make use of (2.10).

THEOREM 5. The polynomial $\omega_M(u)$ has only simple factors in $F[u]$. For $P \nmid M$, $\omega_M(u)$ has no repeated factors (mod $P$). For $M = P^eN$, $P \nmid N$, $\omega_M(u) \equiv \omega_N^{\,p^{nke}} \pmod P$, where $\omega_N$ has only simple factors.


3. Definition* of $W_M(u)$. For convenience assume $M$ primary. Then suppose $\omega_M(u)$ exhibited as a product of (necessarily distinct) primary polynomials $f_i$ in $R[u]$:

$$\omega_M(u) = f_1(u)f_2(u) \cdots f_r(u), \tag{3.1}$$

so that the coefficients of $f_i(u)$ are polynomials in $x$. Consider those $f_i(u)$ in the right member of (3.1) that divide no $\omega_A(u)$, where $A$ is a proper divisor of $M$. The product of those $f_i(u)$ is by definition $W_M(u)$, so that in particular $W_M(u)$ is primary. Since $A \mid M$ implies $\omega_A(u) \mid \omega_M(u)$, it is evident that $W_A(u) \mid \omega_M(u)$. Again for $A$ a proper divisor of $M$, it is clear from the definition that $(W_M(u), W_A(u)) = 1$; thus $\omega_M(u)$ is divisible by the product $\prod W_A(u)$, extended over all $A$ dividing $M$. On the other hand since to each $f_i(u)$ in the right member of (3.1) corresponds by the definition a unique $W_A(u)$ of which it is a divisor, it follows that

$$\omega_M(u) = \prod_{A \mid M} W_A(u). \tag{3.2}$$

By inversion we have for $W_M(u)$ the formula

$$W_M(u) = \prod_{M=AB} \omega_A(u)^{\mu(B)}, \tag{3.3}$$

where $\mu(B)$ is the Möbius function† for polynomials in $R(x, p^n)$. From (3.3) certain properties of $W_M(u)$ are immediate. For example the degree of $W_M$ is

$$\phi(M) = \sum_{M=AB} \mu(A)\,|B| = |M| \prod_{P \mid M} \left(1 - |P|^{-1}\right), \tag{3.4}$$

the product extending over all irreducible divisors of $M$. Here $|M| = p^{nm}$, where $m$ is the degree of $M$, so that $|M|$ is the degree of $\omega_M(u)$. Comparison of the degree of both members of (3.2) leads to

$$\sum_{A \mid M} \phi(A) = |M|,$$

which is of course a direct consequence of (3.4). We remark that $\phi(M)$ may be defined independently as the number of quantities in a reduced residue system (mod $M$).

* Cf. Kronecker, Vorlesungen über Zahlentheorie, vol. 1, 1901, p. 283.
† See American Journal of Mathematics, vol. 54 (1932), p. 39.
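The two descriptions of $\phi(M)$, by the product in (3.4) and as the size of a reduced residue system, can be compared by direct enumeration. The sketch below is only an illustration under the assumption $p^n = 2$, with polynomials over $GF(2)$ stored as integer bit masks (bit $i$ is the coefficient of $x^i$); the function names are hypothetical.

```python
def rem(a, b):
    """Remainder of a(x) on division by b(x) over GF(2), b nonzero."""
    while a.bit_length() >= b.bit_length():
        a ^= b << (a.bit_length() - b.bit_length())
    return a

def gcd(a, b):
    """Euclid's algorithm in GF(2)[x]."""
    while b:
        a, b = b, rem(a, b)
    return a

def phi(M):
    """Number of residues of degree < deg M that are prime to M."""
    m = M.bit_length() - 1
    return sum(1 for A in range(1 << m) if gcd(A, M) == 1)

def divisors(M):
    """All divisors of M (every nonzero polynomial over GF(2) is primary)."""
    return [A for A in range(1, 1 << M.bit_length()) if rem(M, A) == 0]

def is_irreducible(A):
    return A > 1 and divisors(A) == [1, A]

def phi_formula(M):
    """|M| times the product of (1 - 1/|P|) over irreducible P dividing M."""
    size = 1 << (M.bit_length() - 1)              # |M| = 2^(deg M)
    for P in divisors(M):
        if is_irreducible(P):
            norm = 1 << (P.bit_length() - 1)      # |P| = 2^(deg P)
            size = size * (norm - 1) // norm
    return size
```

For $M = x^2(x+1)$ (the mask `0b1100`) both `phi` and `phi_formula` give 2, and the values of `phi` over the six divisors of $M$ add up to $|M| = 8$.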
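In the next place we evaluate $W_M(0)$. Since $W_1(u) = \omega_1(u) = u$, $W_1(0) = 0$. We now assume $M \ne 1$. For $M = P^e$, $P$ irreducible, it follows from (1.6) and (1.7) that $W_M(0) = \pm P$. In general (3.3) implies

$$W_M(u) = \prod_{M=AB} \left\{ \frac{\omega_A(u)}{u} \right\}^{\mu(B)},$$

so that

$$W_M(0) = \prod_{M=AB} \left\{ (-1)^a A \right\}^{\mu(B)} = \begin{cases} (-1)^k P & \text{for } M = P^e, \\ 1 & \text{otherwise}, \end{cases} \tag{3.5}$$

where $a$, $k$ is the degree of $A$, $P$, respectively.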

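Suppose next that $M$ and $N$ are arbitrary, $M \ne N$; then (3.2) implies that $W_M(u)W_N(u)$ is a divisor of $\omega_{MN}(u)$. Since $\omega_{MN}(u)$ has no repeated factors it follows at once that

$$(W_M, W_N) = 1 \qquad (M \ne N). \tag{3.6}$$

If the irreducible $P \nmid MN$, we may assert slightly more:

$$(W_M, W_N)_P = 1 \qquad (M \ne N,\ P \nmid MN). \tag{3.7}$$

In the general case, note that for $P \nmid M$, (3.3) implies

$$W_{P^eM} = \prod_{M=AB} \omega_{P^eA}^{\,\mu(B)} \prod_{M=AB} \omega_{P^{e-1}A}^{\,-\mu(B)} \equiv \prod_{M=AB} \omega_A^{\,\phi(P^e)\mu(B)} \pmod P,$$

by (2.10); and therefore (for $P \nmid M$)

$$W_{P^eM} \equiv (W_M)^{\phi(P^e)} \pmod P. \tag{3.8}$$

Hence we conclude that for $P \nmid MN$

$$(W_{P^eM}, W_{P^fN})_P = 1 \qquad (M \ne N), \tag{3.9}$$

while

$$(W_{P^eM}, W_{P^fM})_P \equiv (W_M)^{\phi(P^e)} \pmod P \tag{3.10}$$

for $e \le f$.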
Consider* now the greatest common divisor of W y(wy(u)) and Wy(wa(u)), 
or briefly 


(Wu(wr), 
We take first the case (M, N) =1. Then by (3.3) and (2.5), 
(3.11) = TI = wo”. 
M=AB .M=AB AN=DE 


Now since (M, N) =1, the factorization DE may be obtained by factoring A 
and WN independently and then combining in all possible ways. Thus (3.11) 
becomes 


M=ABC N=DE N=DE M=AH 
asi 1 for H =1 
or H = 1, 
so that M=AG reduces to M=A. Therefore by (3.10) and (3.12) we have 
$$W_M(\omega_N) = \prod_{D \mid N} W_{DM} \qquad \text{for } (M, N) = 1. \tag{3.13}$$

Interchanging $M$ and $N$, (3.13) becomes

$$W_N(\omega_M) = \prod_{A \mid M} W_{AN}. \tag{3.14}$$

By (3.6) the greatest common divisor of $W_M(\omega_N)$ and $W_N(\omega_M)$ may be found by picking out the equal terms in the right members of (3.13) and (3.14). But $AN = DM$ together with $(M, N) = 1$ implies $N \mid D$, whence $D = N$ and $A = M$. Thus for $(M, N) = 1$,

$$(W_M(\omega_N), W_N(\omega_M)) = W_{MN}. \tag{3.15}$$
Suppose next that the irreducible $P \mid M$; then from (3.3) follows

$$W_M(\omega_P) = \prod_{M=AB} \omega_{AP}^{\,\mu(B)} = W_{MP}.$$

More generally if every irreducible divisor of $A$ is also a divisor of $M$, we have similarly

$$W_M(\omega_A) = W_{MA}. \tag{3.16}$$

* For the proof compare Netto, Archiv der Mathematik und Physik, (3), vol. 4 (1902), pp. 65-67.
If now M =| N =| are arbitrary, put 

M = MoM, P* (PIN), Mi=[[ (PIN), 

N = No=[] (PIN), (PIM), 
so that 

(Mo, M1) = (No, Ni) = (Mi, Ni) = 1, 
while Mo, No have precisely the same irreducible divisors. Then by (3.16), 
Wu(ow) = = Wr 

and this in turn, by (3.13), implies 


Wau(ow) = I] 
DIN; 


Interchanging M and N in this equation, we have 
= I] 


A\M, 
As above the condition for common factors in the right members is DN»>M.M, 
=AM,N.M,, that is, DM,=ANi, whence D=N and A =M, and therefore 
(3.15) holds generally. The case M=N is included, for by (3.16), Wu(wa) 
=W 
THEOREM 6. For arbitrary $M$, $N$, the greatest common divisor

$$(W_M(\omega_N), W_N(\omega_M)) = W_{MN}.$$


4. Irreducibility of $W_M(u)$. Let $\beta$ be a root of $W_M(u) = 0$ in a properly chosen $F_1 \supset F(x, p^n)$. If $(M, A) = 1$, the identity (3.11) implies

$$W_M\{\omega_A(\beta)\} = \prod_{D \mid A} W_{DM}(\beta) = 0,$$

so that $\omega_A(\beta)$ is also a root of $W_M(u)$. Assume $\omega_A(\beta) = \beta$. Then the polynomial $\omega_A(u) - u = \omega_{A-1}(u)$ has a root in common with $W_M(u)$, from which it follows that $A \equiv 1 \pmod M$. Similarly for $(M, A) = (M, B) = 1$, $\omega_A(\beta) = \omega_B(\beta)$ implies $A \equiv B \pmod M$. Thus it is clear that if $\beta$ is any root of $W_M(u) = 0$, then the quantities $\omega_A(\beta)$, where $A$ ranges over a reduced residue system (mod $M$), are distinct roots of $W_M(u) = 0$; by calculating the degree of $W_M(u)$ it is easily seen that the $\omega_A(\beta)$ furnish all the roots.
We shall now show* that $W_M(u)$ is irreducible in $F[u]$ or what amounts to the same thing (by Theorem 2) in $R[u]$. Assume the factorization

$$W_M(u) = f(u)g(u), \tag{4.1}$$

where $f(u)$ is irreducible in $R[u]$. Let $\beta$ be a root of $f(u) = 0$ (in a field $F_1 \supset F$). By the above paragraph, if we can show that $\omega_A(\beta)$ is also a root of $f(u) = 0$ for all $(A, M) = 1$, it will follow that $f(u)$ coincides with $W_M(u)$, and therefore that $W_M(u)$ is irreducible in $R[u]$. Clearly it suffices to show that $\omega_P(\beta)$ is a root of $f(u) = 0$ for all irreducible $P$ not dividing $M$. Assume therefore that $f(\omega_P(\beta)) \ne 0$, so that necessarily $g(\omega_P(\beta)) = 0$. Thus we see that the polynomial $g(\omega_P(u))$ has a root in common with the irreducible $f(u)$, and therefore

$$f(u) \mid g(\omega_P(u)). \tag{4.2}$$

On the other hand, by (2.9), for $P$ of degree $k$,

$$\omega_P(u) \equiv u^{p^{nk}} \pmod P,$$

which implies

$$g(\omega_P(u)) \equiv g(u^{p^{nk}}) \equiv g^{p^{nk}}(u) \pmod P.$$

Comparison with (4.2) shows that

$$(f(u), g(u))_P \equiv h(u) \pmod P,$$

where $h(u)$ is of positive degree in $u$. Thus (4.1) implies that $W_M(u)$ has a repeated factor (mod $P$); since $P \nmid M$, this contradicts Theorem 5. We may state the following:

THEOREM 7. For arbitrary $M$, the polynomial $W_M(u)$ is irreducible in $F[u]$, the ring of polynomials in $u$ with coefficients in the field $F(x, p^n)$ of rational functions of $x$ in $GF(p^n)$.

* Cf. Weber, Algebra, 2d edition, vol. 1, 1898, pp. 596-600.

This theorem may be extended somewhat.* For $(M, N) = 1$, let $\beta$ be a root of $W_M(u) = 0$, $\gamma$ a root of $W_N(u) = 0$. Then

$$\omega_{MN}(\beta + \gamma) = \omega_{MN}(\beta) + \omega_{MN}(\gamma) = \omega_N(\omega_M(\beta)) + \omega_M(\omega_N(\gamma)) = 0,$$

so that $\beta + \gamma$ is a root of $\omega_{MN}(u) = 0$; indeed we shall now show that it is a root of $W_{MN}(u) = 0$. For assume $W_D(\beta + \gamma) = 0$, where $D$ is a proper divisor of $MN$, from which follows $\omega_D(\beta + \gamma) = 0$. Now $D = AB$, where $A \mid M$, $B \mid N$; we may suppose that $A$ is a proper divisor of $M$. Then by (2.5), $\omega_{AB}(\beta + \gamma) = 0$ implies $\omega_{AN}(\beta + \gamma) = 0$; but as above

$$\omega_{AN}(\beta + \gamma) = \omega_A(\omega_N(\beta)) + \omega_A(\omega_N(\gamma)) = \omega_A(\omega_N(\beta)).$$

Since $(N, M) = 1$, $\omega_N(\beta)$ is a root of $W_M(u) = 0$ and therefore not of $\omega_A(u) = 0$. This proves

$$W_{MN}(\beta + \gamma) = 0 \qquad (\text{for } (M, N) = 1). \tag{4.3}$$

* Cf. Weber, loc. cit., pp. 600-601.


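Under the hypothesis $(M, N) = 1$ we may choose $A$, $B$ such that $AM + BN = 1$. Put $\alpha = \beta + \gamma$, then

$$\begin{aligned}
\omega_{AM}(\alpha) &= \omega_{AM}(\beta) + \omega_{AM}(\gamma) \\
&= \omega_{AM}(\beta) + \gamma - \omega_{BN}(\gamma) \\
&= \omega_A(\omega_M(\beta)) + \gamma - \omega_B(\omega_N(\gamma)),
\end{aligned}$$

so that we have

$$\omega_{AM}(\alpha) = \gamma, \qquad \omega_{BN}(\alpha) = \beta. \tag{4.4}$$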


Let us now assume that $W_M(u)$ factors in $F_1[u]$, where $F_1 = F(\gamma)$ is the field obtained by adjoining $\gamma$ to $F$: put

$$W_M(u) = f(u, \gamma)\,g(u, \gamma),$$

where $f(u, v)$, $g(u, v)$ are polynomials with coefficients in $F$. Let $\beta$ be a root of $f(u, \gamma) = 0$, then by (4.4)

$$f\{\omega_{BN}(\alpha), \omega_{AM}(\alpha)\} = 0.$$

But since $W_{MN}(u)$ is irreducible in $F[u]$, it follows from the first paragraph in this section that

$$f\{\omega_{BN}(\omega_H(\alpha)), \omega_{AM}(\omega_H(\alpha))\} = 0 \tag{4.5}$$

for all $(H, MN) = 1$. Now for arbitrary $(D, M) = 1$, we may choose $H$ so that

$$H \equiv D \pmod M, \qquad H \equiv 1 \pmod N.$$

Since by (4.4),

$$\omega_{BN}(\omega_H(\alpha)) = \omega_H(\omega_{BN}(\alpha)) = \omega_H(\beta) = \omega_D(\beta),$$
$$\omega_{AM}(\omega_H(\alpha)) = \omega_H(\omega_{AM}(\alpha)) = \omega_H(\gamma) = \gamma,$$

we have after substitution in (4.5),

$$f\{\omega_D(\beta), \gamma\} = 0,$$

so that $f(u, \gamma)$ has all the roots of $W_M(u) = 0$.

THEOREM 8. Let $(M, N) = 1$, $\gamma$ a root of $W_N(u) = 0$, $F_1 = F(\gamma)$ the field obtained by adjoining $\gamma$ to $F$; then the polynomial $W_M(u)$ is irreducible in $F_1[u]$.


As an application of Theorem 7 we state the following theorems: 




THEOREM 9. The group for the field $F$ of the equation $W_M(u) = 0$ is abelian; indeed it is simply isomorphic with the group (with respect to multiplication) of the reduced residue system (mod $M$).

THEOREM 10. If $\beta$ is a root of $W_M(u)$, and $t$ is an indeterminate, then the group for the field $F(\beta, t)$ of the equation $\omega_M(u) = t$ is abelian; indeed it is simply isomorphic with the additive group of residues (mod $M$).

5. Irreducibility proofs for the case M =P’. In the case M =P’, where P, 
is irreducible, Theorem 7 may be proved very quickly in the following way. 
Assume the factorization 


Wpe(u) = f(u)g(u), 
where f(w) and g(u) are in R[u]. By (3.5) Wp(0) = +P, so that f(0)g(0) = +P. 
We may therefore suppose that f(0)=c, an element of GF(p"). Construct 
the polynomial 


(5.1) h(u) = IT flea), 


where A ranges over a reduced residue system (mod P’*). Let 8 be an arbi- 
trary root of f(u) =0. For fixed A, P}A, determine B such that AB=1+DP". 
Thus 


wp(wa(8)) = wpa(8) = 8 + wpr(8) + B + wn(wr-(B)) = B, 
flon{wa(8)}] = f(8) = 0, 
from which it follows that 
h(wa(8)) = 0, 


so that =0 is satisfied by every root of Wy(u) =0. Therefore W 4(u)| h(u), 
and W(0)|4(0). But W»(0) = +P and from (5.1) it follows at once that 
h(0) =1. This evidently proves our theorem. 

It is clear from (1.6) that except for the coefficient of the highest power of 
u, all coefficients of W p(u) are divisible by P, while the last coefficient is +P. 
Let & be the degree of P; then by (2.5) and the last sentence, we have 


wp*(1) = wp(wp()) 


wp(u) wp(u) {wp(u)} + P-o(u), 


W p(u) = 


so that except for the leading term every coefficient is a multiple of P. The 
last term (that is, the one free of ) is precisely (—1)*P. Clearly we may con- 
tinue in this way and prove that in Wp*(u) every coefficient after the first 
is divisible by P, while by (3.5) the last term is (—1)*P. Then the irreducibil- 
ity of Wp*(u) in R[u] follows as a special case of the following theorem: 


THEOREM 11. If in $f(u) = u^s + A_1u^{s-1} + \cdots + A_s$ all the $A_i$ are divisible by some irreducible $P$, while $P^2 \nmid A_s$, then $f(u)$ is irreducible in $R[u]$.

Clearly this is an analogue of Eisenstein's well-known criterion for irreducibility. To prove the theorem, assume the factorization

$$f(u) = (u^r + M_1u^{r-1} + \cdots + M_r)(u^t + N_1u^{t-1} + \cdots + N_t).$$

We may suppose that $P \mid M_r$, while $P \nmid N_t$. Then from

$$A_{s-1} = M_rN_{t-1} + M_{r-1}N_t,$$

since $P \mid A_{s-1}$, $P \mid M_r$, $P \nmid N_t$, it follows that $P \mid M_{r-1}$. Similarly from

$$A_{s-2} = M_rN_{t-2} + M_{r-1}N_{t-1} + M_{r-2}N_t,$$

it follows that $P \mid M_{r-2}$. Thus we prove that all the $M_i$ are divisible by $P$. Consider now the coefficient of $u^r$:

$$A_t = N_t + M_1N_{t-1} + M_2N_{t-2} + \cdots.$$

Since $P \mid A_t$ this equation is certainly impossible. Hence $f(u)$ is irreducible.


6. Factorization of $W_M$ (mod $P$). Assume first that $P \nmid M$. As usual let $k$ be the degree of $P$. Let $e > 0$ be the smallest exponent such that

$$P^e \equiv 1 \pmod M. \tag{6.1}$$

Then $e \mid \phi(M)$, where $\phi(M)$ is the Euler function for polynomials in $R$, and is evaluated by (3.4). We recall for later use that $\phi(M)$ is the degree of $W_M(u)$. To begin with, we have

$$\omega_M(u) = \prod_{A \mid M} W_A(u); \tag{6.2}$$

secondly since $M \mid (P^e - 1)$ it follows that

$$\omega_M(u) \mid \omega_{P^e-1}(u). \tag{6.3}$$

Next by (2.1) and (2.10),

$$\omega_{P^e-1}(u) = \omega_{P^e}(u) - u \equiv u^{p^{nke}} - u \pmod P. \tag{6.4}$$

Now since $P$ is irreducible, the complete set of residues (mod $P$) form a finite field, which is indeed a concrete representation of the $GF(p^{nk})$. Then by a well-known theorem, we have the identity

$$u^{p^{nke}} - u \equiv \prod_{\deg f \mid e} f(u) \pmod P, \tag{6.5}$$

the product extending over all $f(u)$ irreducible (mod $P$) and of degree a divisor of $e$. Then by (6.2), (6.3), (6.4), (6.5), it follows that $W_M(u)$ is congruent (mod $P$) to the product of a certain number of the $f(u)$ occurring in the right member of (6.5). We shall now prove that in this factorization a polynomial $f(u)$ of degree $< e$ cannot appear. For suppose

$$f(u) \mid W_M(u) \pmod P, \tag{6.6}$$

where $f(u)$ is of degree $s < e$. By (6.5) we have also

$$f(u) \mid u^{p^{nks}} - u \pmod P. \tag{6.7}$$

Then (6.6) and (6.7) together imply

$$f(u) \mid (W_M(u), \omega_{P^s-1}(u))_P \pmod P, \tag{6.8}$$

since by (2.10)

$$u^{p^{nks}} - u \equiv \omega_{P^s-1}(u) \pmod P.$$

Using (3.2), it follows from (6.8) that for some $A \mid (P^s - 1)$

$$f(u) \mid (W_M(u), W_A(u))_P \pmod P.$$

Since $e$ is the smallest exponent for which (6.1) holds, $M \ne A$; and since $P \nmid MA$, we have a contradiction with (3.7). Therefore we conclude that all the irreducible divisors (mod $P$) of $W_M(u)$ are of degree $e$. Comparing with the degree of $W_M(u)$, we have the following:

THEOREM 12. For irreducible $P \nmid M$, let $e > 0$ be the least exponent for which $P^e \equiv 1 \pmod M$. Then

$$W_M(u) \equiv f_1(u)f_2(u) \cdots f_r(u) \pmod P, \tag{6.9}$$

where the $f_i(u)$ are irreducible (mod $P$) of degree $e$, and $er = \phi(M)$, as defined by (3.4).

To remove the restriction on $P$ we use (3.8). Then we have the more general theorem:

THEOREM 13. For irreducible $P$, let $M = P^sM_1$, where $P \nmid M_1$. Let $e > 0$ be the least exponent for which $P^e \equiv 1 \pmod{M_1}$. Then

$$W_M(u) \equiv \{f_1(u)f_2(u) \cdots f_r(u)\}^{\phi(P^s)} \pmod P, \tag{6.10}$$

where the $f_i(u)$ are irreducible (mod $P$) of degree $e$, and $er = \phi(M_1)$.


As an application we consider certain congruences. We take first

$$W_M(u) \equiv 0 \pmod P, \tag{6.11}$$

where $P \nmid M$. Since solutions occur only when $W_M(u)$ has linear factors (mod $P$), it is clear that $P$ must be $\equiv 1 \pmod M$. In that case there are precisely $\phi(M)$ solutions; if $\beta$ is a particular solution, all solutions are furnished by $\omega_A(\beta)$, where $A$ ranges over a reduced residue system (mod $M$).

Next consider the congruence

$$\omega_M(u) \equiv 0 \pmod P, \tag{6.12}$$

where P}M. Let D=(M, P-—1), so that (as in deriving (2.7)) D=AM 
+B(P—-—1), for properly chosen A, B. Then by (2.1) 


wp(u) = g(u)wmu(u) + h(u)wp_i(u) 


(6.13) 
= g(u)wu(u) + H(u)(w™* — (mod P). 


Thus all solutions of (6.12) are also solutions of wp(u)=0 (mod P). We may 
therefore suppose in (6.11) that P=1 (mod M). In this case we may show 
that (6.12) has | M| solutions (where as above | M| =p", m=deg M). In- 
deed if we put P—1=MD, (2.9) and (2.5) imply 


(6.14) — 4 = wp(wy(u)) (mod P), 


so that wa(u) divides u*—u (mod.P), and therefore the congruence (6.12) 
has the maximum number of solutions. Again a solution of (6.11) is also a 
solution of (6.12). Let 8 be a solution of Wx(u) =0. Then for arbitrary A we 
have 


wm (wa(8)) = wa(wu(8)) = 0 (mod P), 


so that w,(8) is a solution of w(u)=0. Assume next that w4(8)=wa(8), 
whence w,_2(8) =0. But this implies 


= 


and therefore Wu(u)|ws—s(u), so that M|A—B. Thus the | M| roots of 
(6.12) are furnished by w,(8), where A ranges over a complete residue system 
(mod M). It is clear from the above that the roots of (6.11) may be described 
as the primitive roots of (6.12). 

If as above P—1 = MD, (6.14) holds and we see that 
(6.15) — u = {wu(u) — 5} (mod P), 

where 6 ranges over the roots of wp(u) =0. Since u®™“ —u is completely factor- 
able (mod P) it follows that for fixed 6, the congruence 


(6.16) wu(u) = 6 (mod P) 


has | M| roots. If uo is a particular solution of (6.16), then m+ is also a 
solution of (6.16), where uw is any solution of the congruence wa(u)=0 
(mod P). Clearly if 6 is not a root of wp(u)=0, the congruence (6.16) has 
no solutions. This follows from 


wp { 5} = wp_;(u) — wp(6) = — u — wp(S) = — wp(8), 




for all w (mod P). We may now state the following two theorems.* 


THEOREM 14. The congruence (6.12) is completely solvable if and only if $P \equiv 1 \pmod M$; similarly for (6.11). If $\beta$ is any root of (6.11), the general solution of (6.11) is $\omega_A(\beta)$, where $A$ ranges over a reduced residue system (mod $M$); the general solution of (6.12) is $\omega_B(\beta)$, where $B$ ranges over a complete residue system (mod $M$).
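The first assertion of Theorem 14 can be illustrated by counting roots in a small field. The sketch below is an illustration only, under the assumption $p^n = 2$ (polynomials over $GF(2)$ stored as integer bit masks, bit $i$ being the coefficient of $x^i$); it takes $M = x^2+x+1$ and compares $P = x^4+x+1$, for which $P - 1 = M(x^2+x)$, with $P = x^3+x+1$, for which $(M, P-1) = 1$. The function names are hypothetical.

```python
def clmul(a, b):
    """Product of two polynomials over GF(2) stored as bit masks."""
    out = 0
    while b:
        if b & 1:
            out ^= a
        a <<= 1
        b >>= 1
    return out

def rem(a, P):
    """Remainder of a(x) on division by P(x) over GF(2)."""
    while a.bit_length() >= P.bit_length():
        a ^= P << (a.bit_length() - P.bit_length())
    return a

def omega(M):
    """Coefficients [a_0, a_1, ...] of omega_M = sum_j a_j(x) u^(2^j)."""
    m = M.bit_length()
    result = [0] * max(m, 1)
    power = [1]                          # (Q - x)^0 u = u
    for k in range(m):
        if (M >> k) & 1:
            for j, a in enumerate(power):
                result[j] ^= a
        nxt = [0] * (len(power) + 1)     # apply Q - x once more
        for j, a in enumerate(power):
            nxt[j + 1] ^= clmul(a, a)    # Q: squaring (characteristic 2)
            nxt[j] ^= a << 1             # -x = +x over GF(2)
        power = nxt
    return result

def count_roots(M, P):
    """Number of residues u (mod P) with omega_M(u) congruent to 0 (mod P)."""
    coeffs = omega(M)
    k = P.bit_length() - 1
    roots = 0
    for u in range(1 << k):
        value, power = 0, u              # power runs through u^(2^j) mod P
        for a in coeffs:
            value ^= rem(clmul(a, power), P)
            power = rem(clmul(power, power), P)
        roots += value == 0
    return roots
```

Here `count_roots(0b111, 0b10011)` gives $4 = |M|$, so that (6.12) is completely solvable, while `count_roots(0b111, 0b1011)` gives 1, the only solution being $u \equiv 0$.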

THEOREM 15. Let $P - 1 = MD$. The congruence (6.16) is solvable if and only if $\delta$ is a root of $\omega_D(\delta) \equiv 0 \pmod P$. If $u_0$ is a particular solution of (6.16), then the general solution is furnished by $u_0 + \mu$, where $\mu$ ranges over the roots of $\omega_M(u) \equiv 0 \pmod P$.

Finally we generalize the last theorem by removing the restriction $M \mid P - 1$. Let $(P - 1, M) = H$, so that $M = AH$, $P - 1 = BH$. Then if (6.16) is assumed solvable, we have

$$\omega_B(\delta) \equiv \omega_{BHA}(u) = \omega_{P-1}(\omega_A(u)) \equiv 0,$$

so that a necessary condition is

$$\omega_B(\delta) \equiv 0 \pmod P. \tag{6.17}$$
Again for A:\M+B,(P —1) =H, it follows readily that 
(6.18) = wa,(8) (mod P). 


By Theorem 15, (6.17) is a sufficient condition for the solvability of (6.18). 
But if (6.18) holds, it is clear that 

wau(u) = wan(u) = waa,(6) = w,(5) — wee, (5) = 6, 
so that (6.16) is indeed satisfied. Thus (6.17) is both necessary and sufficient 
for the solvability of (6.16). Also it is evident from the above that (6.16) has 
exactly the same solutions as (6.18). We have therefore the following: 

THEOREM 16. For arbitrary M, let (M, P—1)=H, M=AH, P—1=BH, 
AA,+BB,=1. Then the congruence (6.16) is solvable if and only if (6.17) holds; 
the congruences (6.16) and (6.18) are equivalent. 

THEOREM 17. Let $(M, P - 1) = 1$. Then for arbitrary $\delta$, the congruence (6.16) has a unique solution. Thus (6.16) defines a (1, 1) transformation of the residues (mod $P$); the inverse of the transformation is $\omega_{A_1}(\delta) \equiv u \pmod P$, where $A_1M \equiv 1 \pmod{P - 1}$.

For in this case B=P—1, and (6.17) is automatically satisfied. 


* Analogues of well known results on binomial congruences, modulo p. 


DUKE UNIVERSITY,
DURHAM, N. C.


ANALYTICITY OF EQUILIBRIUM FIGURES OF ROTATION * 


BY 
BERNARD FRIEDMAN 


INTRODUCTION 


The problem of ascertaining the possible forms of relative equilibrium of 
a homogeneous gravitating mass of liquid, when rotating about a fixed axis 
with constant angular velocity, had its origin in the investigations on the 
theory of the earth’s figure which began with Newton and MacLaurin. In 
recent times it has undergone much development especially at the hands of 
Poincaré, Liapounoff, and Lichtenstein. 

We take the axis of rotation as the axis of $z$ and the mass-center, which must evidently lie on the axis, as origin. If $\omega$ be the angular velocity of rotation, the component accelerations at $(x, y, z)$ are $-\omega^2x$, $-\omega^2y$, $0$ and the dynamical equations reduce to

$$-\omega^2x = -\frac{1}{\rho}\frac{\partial p}{\partial x} - \frac{\partial \Omega}{\partial x}, \qquad -\omega^2y = -\frac{1}{\rho}\frac{\partial p}{\partial y} - \frac{\partial \Omega}{\partial y}, \qquad 0 = -\frac{1}{\rho}\frac{\partial p}{\partial z} - \frac{\partial \Omega}{\partial z},$$

where $\Omega$ is the potential energy per unit mass, $p$ the pressure, and $\rho$ the density. Hence, integrating, we have

$$p/\rho = \tfrac{1}{2}\omega^2(x^2 + y^2) - \Omega + \text{const.}$$

At the free surface, $p =$ constant and we have

$$\tfrac{1}{2}\omega^2(x^2 + y^2) + \int_R \frac{dV_Q}{\overline{PQ}} = \text{const.}, \tag{1}$$

where $R$ is the region containing the rotating mass.

Liapounoff and Lichtenstein† have proved that, at all points where the apparent gravity is not zero, the surface possesses continuous derivatives of all orders, but the problem of the analyticity has so far defied solution. This problem is equivalent in difficulty to that of the analyticity of the solutions of elliptic differential equations of the second order, which was solved by E. Hopf.‡

* Presented to the Society, September 10, 1937; received by the editors July 6, 1936, and in revised form, January 11, 1937.

† Lichtenstein, Gleichgewichtsfiguren rotierender Flüssigkeiten.

‡ Mathematische Zeitschrift, vol. 34 (1931), p. 194.




In this article I use the method developed by E. Hopf in the above paper 
to prove that the surface of equilibrium figures of rotation is analytic at all 
points where the apparent gravity exists, that is the gradient of the pressure 
is not zero. 

The equation of the surface of revolution is given implicitly by equation 
(1). By a few simple transformations we generalize (1) so that it will have a 
meaning for complex values of x and y and then we differentiate partially 
with respect to x and y obtaining equations (10) of Part I. 

Knowing that a solution exists for real values of x and y we set up a se- 
quence of approximating non-monogenic functions (cf. equations (14), (15) 
of Part II) which reduce for x and y real to the known solution. We prove the 
sequence converges uniformly and that the limit is an analytic function. 

I wish to express my thanks and gratitude to Professor E. Hopf who pro- 
posed the problem and without whose assistance and encouragement it would 
not have been solved. 


I. FORMULATION OF THE PROBLEM 


We wish to prove that if $R$ is any 3-dimensional region, $B$ its boundary, which satisfies the following equation:

$$\int_R \frac{dV_Q}{\overline{PQ}} - F(P) = 0 \qquad \text{for all } P \text{ on } B, \tag{1}$$

where $F(P)$ is an analytic function of $P$, $dV$ the element of volume and $\overline{PQ}$ the distance from $P$ to $Q$, then the surface formed by $B$ is analytic at all points $P'$ where the gradient of (1) is not zero. It will be assumed that surfaces satisfying this equation possess a sufficient number of derivatives. This has been proved by Lichtenstein.*

Take any such point $P'$ as $(0, 0, z_0)$, $z_0 > 0$, and let the $z$-axis be normal to $B$ at $P'$. Let the equation of the surface be $z = z(x, y)$. Then since $z(x, y)$ has partial derivatives of all orders we have $z_x'(0, 0) = z_y'(0, 0) = 0$.

Since the gradient of (1) is not zero 


<|f - #0, P= (0,0,%). 
0z R 

Because of the continuity of the surface there then exist positive numbers r, rz 

such that for x?+-y?<r? we have z(x, y) >0, |2(x, y) —20| and 


We 
(2) ~| f PO F(P)|>4d for P = (x, y, 2), 


* Loc. cit. 




where d is some positive constant. 
Let a<r;. Denote the semi-cylinder £?+7?<a?, £>0 by (a). Then (1) can 
be written as 


dV, dV. 
3 —2 — F(P = 0, 
(3) +f 0= (2,5) 


The second integral, call it G,(x, y, 2), is the potential at P of the region 
R—(a)-R. G(x, y, z) is known to be an analytic function of x, y, z as long as P 
is not in R—(a)-R that is if x?+?<a?, |z—20| <n. 

Consider* 


f 
(a)-R PQ \emz(z,y) 
dt 
Oz a 0 [(é x)? + (n y)? + (¢ z)?]4/2 
[ot + 2a, 


where a denotes the circle ¢?-++-?<a?, p?=(—x)?+(n—~)? and so 


<f 


By Schmidt’s inequality, we have, however 


d ‘d' 1/2 
(4) ff = 
a p Tv 


so that 
f 
(a)-R PQ 


Taking 47a <2d and using (2) and (3) we have 


< 4ra. 


> 2d for x«?+ y? < a’, |z — 20| < re. 


We can now write (1) as follows: 


* Note that z means the third independent coordinate of the set (x, y, z) but 2(x, y) refers to the 
equation of the surface as a function of x, y. 
¢ Sternberg, Potentialtheorie, Sammlung Gischen, p. 99, equation (3). 




ff +G.—-F=0 
©) 


The integral can be split up into 


f 2(z,y) dg 
did f + | 
a 0 2(z,y) [p? + (§ — 2(z, 


After we make the change of variable ¢’ = z(x, y) —¢ and integrate, it becomes 


f 2(x, +4 (= y) 2(é, 


where f(u) =log (u+(u?+1)"/*) is regular for u =0. 
Call the integral of the first term g,(x, y, z), so that 


2) = ff [log [z + (2? + p*)'/2] — log p]dtdn. 


2) is an analytic function of x, y, z for |z—zo| <r2, x?+-y? <a? because 
z+(z?+p?)'/?>0 for all £, 7 and 


a? 
ff » adn + (ogc — }). 
Now by 


dédn dédn 
gals, z)= Si (2 + re <ff = — 2za, 


so that 


|= 
£a\%, ¥, 2 


Call g.+G.—F =H.,(x, y, z). Then H, is an analytic function of x, y, z for 
<a’, |z—z0| and 


>d. 


(x, y, 2) 


oH, 
(6) | 


Equation (5) can now be written 


(7) f J (= didn + H.(x, y, 2) = 0. 


To put (7) in a more easily handled form we differentiate it with respect to x 
and y. We have 




dH. 


p p 
(8) 
x — OH, 
p 0 
and a similar equation for z, . 
Let 
0H, 0H, OH, 
(9) ax ay 


a2(x,y) = a(x, y), (x, y) = 22(x, 9). 


Call the integral in (8) F(x, y) and the corresponding integral in the equation 
for 2) , y). 
Then our equations become, omitting for convenience the subscript a, 


(10) a(x, y) L(x, 2(x, y)) + Fi(x, = M(x, 2(x, y)); 
Zo(x, y) L(x, 2(x, y)) + F.(x, y) N(x, a(x, y)); 
(11) 2(x, y) — 2(a cos ¢, a sin g) = J + 22(x’, y’)dy’, 


21(0, 0) = 22(0, 0) = 0. 

We wish now to consider (10) for complex values of x and y. To do this 
we shall extend the meaning of our integrals so that it will take account of 
complex values of x and y. Then let 2 (x, y), 21° (x, y), z2(x, y) be any con- 
tinuous functions of the complex variables x, y which reduce for real x and 
real y to 2(x, y), 2:(x, y), z2(”, y) the solutions of (10). We then set up by the 
method of successive approximations Z(x, y), Z:(x, y), Z2(«, y) which satisfy 
the extended form of (10) also for complex values of x, y and reduce, for real 


x and real y, to 2(x, y), 2:(x, y), 22(x, y). 
Let x=x'+ix’’, y=y’+iy’’. We shall restrict « and y to the region R,, 


where R, is defined as follows: 
(12) + y'2 < a’, < — + y’2) 1/2], 
Note that R, is convex, for if (1, y:) and (x2, ye) are in R, so is (x, y) =¢(x1, 1) 
+(1—12)(x2, since 
(x’"2 + 1/2 + y’2)1/2 < t(x{'? + yi (1 + 
+ + + (1 — (ae? + 
S yta+ y(1 — Aa = ya. 


We now extend the meaning of the integrals in (10). In 


(13) Z(x, y) — Z(a cos ¢, a sin ¢) = f Z1(x’, y’)dx’ + Z(x’, y’)dy’ 

let x’ =r'(x—a cos +a cos ¢, y’=7'(y—a sin +a sin ¢, <1. Then 
(14) Z(x, vy) — Z(a cos ¢, a sin ¢) = file — acos ¢)Z,;+(y — asin ¢)Z2]dr’ 
which has a meaning for complex x and y. In 


9) — 9) = f + Za(or, 
let 
a=i+7(x — 8), B=nt+7(y— 7), 0s7381. 


Then 
(15) 9) — Z(G 9) = J B)(x — &) + 6)(y — 2) |r. 


In F,(zx, y) let 
§=x+t(acos¢d — x), 0si<i, 


(16) 
n=y+i(asing — y), 0S ¢< 


The substitution is legitimate since £?+-?<a?. Then 
p = t[(a cos @ — x)? + (asin d — y)?]!/2 = 16), say. 
Using (15) and (16) we have 
1 
Z(x, 9) = 8)(a cos — x) + sin — y) lar 
0 


= IZ 4(x, t, ¢), say, 
and then 


— xcos¢@ — ysin 


(17) 


= F(x, y,%1,24), say. 


Similarly F.(x, y) =F (x, y, 22, 24). Note that (14), (15), (18), and (19) have 
meaning for complex x and y. Our equations can now be written as follows: 




Z,(x, y) L(x, Z(x, y)) + F(x, Zi(x, y), Za(x, y)) M(x, Z(x, y)), 
Z2(x, y) L(x, Z(x, y)) + F(x, y, Zo(x, y), Za(x, y)) N(x, Z(x, y)); 
Z(x, y) — Z(a cos ¢, a sin ¢) 


= f [(x — acos ¢)Z,(x’, y’) + (y — asin ¢)Z2(x’, y’) ]dr’. 


These are three equations for the unknowns Z;,(x, y), Z2(x, y), and Z(x, y). 
Actually by substituting the value of Z from the third equation into the first 
two, we have two equations in the unknowns Z;,(x, y) and Z,(x, y). These 
equations will be considered in the following region: 


in R,, | | z2| 


where c is a constant to be determined later 

We shall now prove that H(x, y, Z(x, y)) is analytic for x, y in R, and 
also obtain bounds for its first and second derivatives when a is small. Now 
in H,=g.+G.—F, g, and F are obviously analytic in R,. G, will be analytic 
if it can be shown that («—£)?+(y—7)?+(z(x, y) —£)?0, where x, y is in R, 
and &, 7, ¢ is in R—(a)-R. This amounts to showing that 


(x — acos ¢)? + (y — asin ¢)? + (Z(x, y) — Z(a cos ¢, a sin ¢))? ¥ 0. 
Let Now 
| (x — acos ¢)? + (y — asin ¢)?| > real part [(x — acos ¢)? + (y — asin¢)?] 
2 (x’ — acos + (y’ — asin — y*(a — 1)? 
because of (12). Also from (14) we have 
| Z(x, y) — Z(a cos ¢, a sin ¢) | < c| x— a cos $| +c| yr a sin ¢| 
and 
| Z(x, y) — Z(a cos ¢, a sin ¢) |? c*{ [(x’ — a cos + — 
+ [(y’ — asin + °(a — 
where ¢ is max [| Z,(x, y)|, |Z2(x, y)| ] for x, y in R,; so that 
| (x — acos ¢)? + (y — asin ¢)?| > | Z(x, y) — Z(a cos ¢, a sin ¢) |? 
if 
(x’ — a cos ¢)* + (y’ — asin ¢)? — y°(a — 1)? 
> — a cos 6)? + y%(a — + [(y’ — sin g)? + 
But if y <3, then 




190 BERNARD FRIEDMAN [March 


(x’—a cos $)?+(y’—a sin 

{ [(x’—a cos [(y’—a sin 2 

1 3y?(a—r)? 1 
{[(e’—acos [(y’—a sin pap 


Therefore if c?<%, G, will be analytic and so will H., La, Ma, and Na. 
Using the above inequality we have 


| (x — acos ¢)? + (y — asin ¢)? + [Z(x, y) — Z(a cos ¢, a sin ¢) |?| 
= — c*){ [(x’ — acosg)? + y*%(a — + [(y’ — asin g)? + 
> (4 — c*)[(x’ — acos¢)? + (y’ — asing)’]. 
Similarly for a* <r? we have 
| (x — + (y — = — 8)? + — 0)? — 9)? 
and 
\Z(x, y) — n) | < | Z(x, y) — Z(acos ¢, asin ¢) | 
+ |Z(a cos ¢, a sin ¢) — Z(E, 1) | 
<c[| x —acos¢| +|acos ¢ — +] y—asing| +| casing — 
4c[| 
and proceeding as above we obtain the similar equality: 
on | (x — (y — 0)? + Zz, ») — ZG, 0))?| 
= — 16c*) [(x’ — + (y’ — 
To obtain bounds for L, M, N, L/, M/, Ni we proceed as follows: 


We have* 
~ =ff cos x) 
—(a)-B 


and similar expressions for the derivatives with respect to y and z where 
—(a)B is the boundary of R—(a)R and dwg is the surface element. Since 
2(é, ) has derivatives of all orders there exists r4<7r3 such that for £?+n?<r? 


cos > > 


where is normal to B at &, n, 2(&, ). Let B’ denote the part of the surface 
for which £?+7?>r?. Then using (18) 


* Ibid., p. 124, equation 35. 




dug 


+— — 16c? ’ 


where dw’ = dw cos (n, 2) is the projection of dw on the (x, y)-plane. 
The integral over B’ is a constant independent of a, while the second in- 
tegral is by Schmidt’s inequality less than 27r;. Therefore, 


(19) < hi, x, yin R, 


Ox 


and similar inequalities exist for the derivatives with respect to y and Z, where 
k, is a constant independent of a. 
For the second derivatives we have 


a 
a°G f f 
= COS (%, 2 


and similar expressions for 0°G/dz0y, 0°G/dz*. As before 


dwg 
+ — f § — 16c?) 
C1 — x’)? + — 9’)? 


1 


dwe 


dw 
020% 


since 


— 16c? 
~ x’)? + (n—- y’)? 
from (18). The first integral is again a constant independent of a. The second 
integral is of the order of log [(a cos @—x’)?+(asin d—y’)*]. Therefore 


Ox r r? 


(20) 


| < kz log [(a cos ¢ — x’)? + (asin g — y’)*], 
20x 


where is independent of a. A similar result holds for the second derivatives 
with respect to yz and 2?. 
Before going any further we must find a better bound for M and NV. When 


—— 
| 




a=0, H, reduces to the left side of (1) and since the z-axis is normal to the 
surface at (0, 0, 2) we have 


L,(0, 0, zo.) = 4d, M,(0, 0, zo) = 0. 
Using (7), (19), and Schmidt’s inequality 
| Lo(x, y)) — Lax, y, 2(x, y))| < haa, 
with similar inequalities for M and N. Since L,, M,, and N, are analytic we 
can therefore choose a and y so small that 


dc 
d/2< |Lal, | Mal, | Na| for x, yin Ry. 


Since F and g, are analytic they are bounded and using (20) we shall have 
| Li | < k log [(a cos @ — x’)? + (a sing — y’)*], 
and the same bound for M/ and WN. 
II. PROOF OF ANALYTICITY
We have to consider the equations 


Zi (x, y) L(x, Z) + F(x, Z1, Zs) M(x, Z), 


(1) Z2(x, y) L(x, Z) + F(x, 22, Z:) N(x, Z), 


1 
Z(x, y) — Z(a cos ¢, asin = f [(acos@ — x)Zi + (asin — y)Z2]dr’, 
0 


where L, M, N are analytic functions of x, y, z for x, yin R,, |x|, |z| <c 
and satisfy the following inequalities in that region: 


k>|L|>d; M| In| <%; 
| | |, | Li | < k log [(a cos ¢ — x’)? + (a sing — y’)?]. 


F is defined as follows: 


1 
F(x, y,Z1,Z4) = f f 
0 0 


acos¢— x 


where 


(4) #? = (a cos — x)? + (asin g — y)*, 


a—xcos¢@— ysing 




and 


(S) Za(x, y,t,¢) = f [Z:(a, 8)(a cos — x) + Z2(a, 8)(a sin — y)|dr, 


where 
(6) a=§+7(x — &), B=n+7(y— 1), 
and 
2+ t(acos¢— x), 0si<i, 
(7) 


If we let x =x’+ix’’, y=y’+iy’’ when x, y is in R,, that is, 
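then $\Phi=0$ if and only if $x = a\cos\phi$, $y = a\sin\phi$. For
$\Phi^2 = [ae^{i\phi} - x - iy][ae^{-i\phi} - x + iy].$
Now if $\Phi=0$, one of the brackets is zero. Assume $ae^{i\phi} = x+iy$. Then
(9)    $a = |x + iy|$
or
(10)    $a^2 = (x' - y'')^2 + (y' + x'')^2 = x'^2 + x''^2 + y'^2 + y''^2 + 2(x''y' - x'y'').$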
By Cauchy’s inequality, 
so that 
+ + — + 
= [ya + (1 — + 


Now this bracket is less than $a^2$ if $x'^2+y'^2<a^2$. So (10) can not be true and
$\Phi \ne 0$. Hence for $\Phi=0$ we must have $x'^2+y'^2=a^2$, which implies $x''=y''=0$
and $x = a\cos\phi$, $y = a\sin\phi$. Therefore

acosg@ — x 
® 


(11) 


are bounded. By taking y sufficiently small the upper bound of these can be 
made less than three so that 


| 




Then 
a— xcos¢@— ysing | _ cos ¢(a cos @ — x) + sin g(a sind — y) | <5, 
® ® 
and calling Z;/@=Z; we have 
— asin g — 


< 


1 
| dr 
0 


(13) 


We now define successive approximations 2, 2°”) , ze”) to Z(x, y), Z,(x, y), 


Z2(x, y) as follows: 


_ F(a, Y, 21 ) _ MC, y, 
L(x,y,2) L(x, 9, on 
(14) ‘ 
(v+1) _ F(x, Y, 22 24 ) N (x, y; 2 
Z2 = 


L(x, y,2) y, 2) 


1 
(15) («, y) — 2(acos ¢, asin = f [(a cos — x)zy + (asing — y)zy |dr’, 
0 


where 
(16) -f (a, B)(a cos — x) + "(a sin @ — y) |dr. 


The first approximations $z_1^{(0)}(x, y)$ and $z_2^{(0)}(x, y)$ are any continuous
functions of the complex variables x, y which reduce, when x, y are real, to $z_1(x, y)$
and $z_2(x, y)$, the solutions of (1) in the real domain. By taking a and $\gamma$ small
enough we shall have


» (0) (0) 
(17) ai |, Ze | <c for wx, yin R,. 


Then from (15) and (16) we have 
(18) | 2 (x, y) — 2(a cos ¢, asin ¢) | < c[| acos¢ — x | + | asing — y|] <= 3ac 


and | 25°| <6c. 
Assume that 


(v) 
’ Ze 


<c for «x, yin R,. 


(19) 21 
Then from (15) and (16) we have 




(20) | s(x, y) — 2(a cos ¢, a sin ¢) | < 3ac, 
(21) | < 6c. 
But from (14) and (2) 


(v+1) 1 @) 


Now 
0 0 


a—xcos@— ysing 
® 

Take 3807a <d(1 —36c?)'/2 so that 

(22) | F|'< 4cd, 


and then $|z_1^{(\nu+1)}| < \tfrac12 c + \tfrac12 c = c$. Therefore (19) holds for all $\nu$.
We define an operator $\Delta^\nu$ as follows:


-a do S 2wa(1 — + 90c). 


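$\Delta^\nu f(x, y, z_i) = f(x, y, z_i^{(\nu)}) - f(x, y, z_i^{(\nu-1)}) = \int \frac{\partial f}{\partial z_i}\,dz_i.$


The integral is to be taken along the straight line joining $z_i^{(\nu-1)}$ to $z_i^{(\nu)}$. Notice
that


(23)    $\Delta^\nu [f(x, y, z_i)\,g(x, y, z_i)] = g(x, y, z_i^{(\nu)})\,\Delta^\nu f(x, y, z_i) + f(x, y, z_i^{(\nu-1)})\,\Delta^\nu g(x, y, z_i).$


By $z_i^{(\nu)}$ we mean the $\nu$th approximation to $z_i$.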
Let 


(» 
i 


—1) v 


max [| A’z,|, | A’z2|] =o, for x, yin R. 


From (14), using (23), we have 


A’ F + A’ M A’F + F( (v—1) 1 
(24) L L L(x, y, : L 
A’M + 1 
L(x, y, 2) L 
Now 
acos¢@— x 
A’F -f (2:2 + 1)-1/2 + + 1)-1/? 
(25) 


a—xcos@— ysing 


$ 




But 


—1/2 (vy) —1/2 


| < | 1) (v1), 2 


(26) A’zi(zs + 1) ) +1) 


and using (21), 


—1/2| — 5 2 —3/2 
(27) 
< 6c(1 — 36¢) — |. 


Hence in (26) 


2,—3/2 (v) (v—1) | 


| + < 6c (1 — 36c ) — 25 


28 
(28) + (1 — 36c%)-1/2| Avz, |. 


Considering the second term in (25), we have 
(¥) 


| Arzs(232 + 1)-1/2 | = (23 + 1)—*/2dz, 


(v—1) 


(29) 
2, —3/2 (») (v—1) 


<(1-—36¢) |. 
From (16) we have, using (11), 
(30) | Avzs| < 60,. 
Hence in (25) using (12), (26), (28), (29), and (30), we have 


(31) | AF | S 2wa(1 — — 36c2)-! + 5 + 90]o, = ad,o,, say. 


We also have 


2(¥) 


| < Li L-*dz 
(32) 
< kd-* log [(a cos ¢ — x’)? + (a sing — y’)?]- | A’s|, 
| f Mi dz 
(33) 
< k log [(a cos ¢ — x’)? + (asin @ — y’)?]- | Ars]. 


But from (15) 
(34) | Ars 


acos@— x| +|asing — y]], 
and, using (31), (32), (33), (34), (21), and (2) in (24), we have 
(35) | A’tig, | < ad—(d; + 3ck log a+ k log a + k*d“ log a)o,. 


< o,| 




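Taking a so small that
$ad^{-1}(d_1 + 3ck\log a + k\log a + k^2d^{-1}\log a) < \tfrac12,$


we have $|\Delta^{\nu+1}z_1| < \tfrac12\sigma_\nu$, and a similar expression for $|\Delta^{\nu+1}z_2|$. Therefore
$\sigma_{\nu+1} < \tfrac12\sigma_\nu$, and $\sigma_\nu$ approaches zero since $\sigma_\nu < (\tfrac12)^\nu\sigma_0$. From (34) $|\Delta^\nu z|$ approaches
zero and hence $z^{(\nu)}$, $z_1^{(\nu)}$, $z_2^{(\nu)}$ approach uniformly limit functions $Z(x, y)$,
$Z_1(x, y)$, $Z_2(x, y)$.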

We now wish to show that $Z(x, y)$ is an analytic function of x and y.
Assume that $z_1^{(0)}(x, y)$ and $z_2^{(0)}(x, y)$ have continuous first partial derivatives
with respect to $x'$, $x''$ and $y'$, $y''$, e.g., by assuming $z_1^{(0)}(x, y) = z_1(x', y')$;
$z_2^{(0)}(x, y) = z_2(x', y')$. Then $z^{(0)}$ and $z_3^{(0)}$ will also have partial derivatives of
the first order. Mathematical induction shows that the same is true for
$z_1^{(\nu)}$, $z_2^{(\nu)}$, $z^{(\nu)}$, and $z_3^{(\nu)}$.

Consider the operators 


Applying V: to (15) we have 

(36) Vite [Vir — x) + Vize (a sing — y)]dr. 

0 

Applying it to (14), we have 

(37) = f [vViz1 (a cos — x) + Vite (a sind — y) Jar’, 
since x and y are analytic. Also 

(v+1) _ F(a, 


@) 
Li (x, y,2 )Viz 


(38) 


acos¢ — (») 


= + 17" ad 


a—xcos¢— ysing 


— L-*(LM; — 


Let max [| Viz 


, | Viz” || =a, for x, yin R,, then in (36) we have 


Vi25 


< 6a,. 



(») 
an 125 
| 




In (37) we have 
| Viz 


Using (21) and (2), we see that the first term on the right of (38) is less than 
4ck log [(a cos ¢—x’)?+(a sin ¢—y’)?]-(|a cos ¢—x| +|asin d—y|). Using 
(21), (11), and (12), we have that the integral in (38) is less than 


a(1 — + 108ac?(1 — 36c?)-'! + = deaa,, say. 


<a,[|acos¢ — x] +]|asing — y|]. 


The third term is less than k*d-*a log a-a,. 
Hence in (38) 


| “m |< [3ck a log a + dea + k*d-*a log ala,. 


Choose a so that 
3cka log a + doa + k*d-*a log a < 3. 


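Then $|\nabla_1 z_1^{(\nu+1)}| < \tfrac12\alpha_\nu$ and $\alpha_{\nu+1} < \tfrac12\alpha_\nu$, so that $\alpha_\nu$ approaches zero.
Since $|\nabla_1 z| < 4a\alpha_\nu$, $\nabla_1 z$ also approaches zero uniformly in x and y and
therefore $\nabla_1 Z(x, y)$ exists and equals zero. This proves that $Z(x, y)$ is analytic
in x. A similar proof holds for the analyticity in y.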

The proof may seem irregular because we have varied our choice of a. 
But it can be seen that all we have required of it is that it satisfies the follow- 
ing inequalities: 


$380\pi a < d(1 - 36c^2)^{1/2},$
$ad^{-1}(d_1 + 3ck\log a + k\log a + k^2d^{-1}\log a) < \tfrac12,$
$3cka\log a + d_2a + k^2d^{-2}a\log a < \tfrac12,$


which can be done since the constants c, k, and d were proved to be independ- 
ent of a. 


MASSACHUSETTS INSTITUTE OF TECHNOLOGY 
CAMBRIDGE, Mass. 




GENERALIZATIONS OF THE GAUSS LAW 
OF THE SPHERICAL MEAN* 


BY 
HILLEL PORITSKY 


1. Introduction. The nature of the generalizations of the Gauss law of the 
spherical mean considered in this paper is illustrated by the following theorem 


(§3): 
If in three-space the function u satisfies the differential equation 
(1)    $\nabla^2 u - \lambda u = 0, \quad \lambda = \text{const.},$
on and within a three-dimensional sphere S of radius r, then
(2)    $A(u) = u_0\,\dfrac{\sinh \lambda^{1/2} r}{\lambda^{1/2} r},$


where A(u) is the average or arithmetic mean of u over S, that is


(3)    $A(u) = \int u\,dS \Big/ \int dS,$


while $u_0$ is the value of u at the center of S.
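
The theorem lends itself to a quick numerical check. The sketch below is only an illustration and not part of the original argument: the helper `sphere_mean`, the particular solution `u`, the center, the radius, and the grid sizes are all assumptions chosen for the example. It averages one solution of (1) over a sphere with a midpoint rule and compares the result with the right-hand side of (2).

```python
import math

def sphere_mean(u, center, r, n_t=160, n_phi=160):
    """Approximate the mean of u over the sphere of radius r about `center`.

    Midpoint rule in t = cos(theta) and in the longitude phi. Equal steps in t
    cut the sphere into bands of equal area, so a plain average suffices.
    """
    cx, cy, cz = center
    total = 0.0
    for i in range(n_t):
        t = -1.0 + (i + 0.5) * 2.0 / n_t
        s = math.sqrt(1.0 - t * t)
        for j in range(n_phi):
            phi = (j + 0.5) * 2.0 * math.pi / n_phi
            total += u(cx + r * s * math.cos(phi),
                       cy + r * s * math.sin(phi),
                       cz + r * t)
    return total / (n_t * n_phi)

# Hypothetical test solution of (1): u = exp(a x) cos(b y) exp(c z) satisfies
# laplacian(u) = (a^2 - b^2 + c^2) u, so lambda = a^2 - b^2 + c^2.
A_COEF, B_COEF, C_COEF = 1.5, 1.0, math.sqrt(0.75)
LAM = A_COEF**2 - B_COEF**2 + C_COEF**2

def u(x, y, z):
    return math.exp(A_COEF * x) * math.cos(B_COEF * y) * math.exp(C_COEF * z)

def gauss_rhs(u0, lam, r):
    """Right-hand side of (2) for lam > 0."""
    x = math.sqrt(lam) * r
    return u0 * math.sinh(x) / x

center, radius = (0.3, -0.2, 0.5), 0.8
numeric = sphere_mean(u, center, radius)
closed_form = gauss_rhs(u(*center), LAM, radius)
print(numeric, closed_form)
```

The two printed numbers should agree up to the quadrature error of the midpoint rule; the center is deliberately taken off the origin to show that only the value $u_0$ at the center enters.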


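A similar result holds for any number of dimensions n. Thus for n=2,
if u is a two-dimensional solution of (1),


(4)    $A(u) = u_0 J_0(i\lambda^{1/2} r),$


where A(u) is the mean of u over S, a circle of radius r. The general case of n
dimensions is given by


(5)    $A(u) = u_0\left[1 + \dfrac{\lambda r^2}{2\cdot n} + \dfrac{\lambda^2 r^4}{2\cdot 4\cdot n(n+2)} + \cdots\right].$

We shall denote the bracket in (5) by $\phi_n(r, \lambda)$:

(6)    $\phi_n(r, \lambda) = \sum_{k=0}^{\infty} \dfrac{\lambda^k r^{2k}}{2\cdot 4\cdots 2k\cdot n(n+2)\cdots(n+2k-2)}.$

This function is expressible in terms of Bessel functions of order n/2−1 as
follows:

(7)    $\phi_n(r, \lambda) = \Gamma(n/2)\,(y/2)^{1-n/2}\,J_{n/2-1}(y), \quad y = (-\lambda)^{1/2} r.$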
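
As an illustration of the series just introduced, the following sketch (an assumption-laden example, not taken from the paper) sums the power series for $\phi_n(r, \lambda)$ directly and compares it with the closed forms it should reduce to: $\sinh x/x$ for n = 3, $\cosh x$ for n = 1 (with $x = \lambda^{1/2}r$), and $J_0(r)$ for n = 2, $\lambda = -1$. The function names and the sample values of r and $\lambda$ are hypothetical.

```python
import math

def phi(n, r, lam, terms=80):
    """Partial sum of the power series for phi_n(r, lam)."""
    total, term = 0.0, 1.0          # the k = 0 term equals 1
    for k in range(terms):
        total += term
        # ratio of consecutive terms: lam r^2 / ((2k + 2)(n + 2k))
        term *= lam * r * r / ((2 * k + 2) * (n + 2 * k))
    return total

def j0_by_average(r, m=64):
    """Mean of cos(r cos(theta)) over a full circle, an integral form of J_0(r)."""
    return sum(math.cos(r * math.cos((j + 0.5) * 2.0 * math.pi / m))
               for j in range(m)) / m

r, lam = 1.3, 0.7
x = math.sqrt(lam) * r
print(phi(3, r, lam), math.sinh(x) / x)        # n = 3: the law (2)
print(phi(1, r, lam), math.cosh(x))            # n = 1: mean over the two end points
print(phi(2, r, -1.0), j0_by_average(r))       # n = 2, lam = -1: J_0(r)
```

Updating each term from the previous one avoids forming the large products in the denominators explicitly.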


* Presented to the Society, December 27, 1928; received by the editors March 9, 1937. 
199 


200 HILLEL PORITSKY [March 


The familiar Gauss law of the mean for harmonic functions in n dimensions
is obtained by putting $\lambda=0$ in (1) and (5).
Similar laws are derived below for solutions of 


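(8)    $(\nabla^2 - \lambda)^p u = 0$
(§3); for these, it is shown that


(9)    $A(u) = u_0\,\phi_n(r,\lambda) + [(\nabla^2-\lambda)u]_0\,\dfrac{\partial\phi_n(r,\lambda)}{\partial\lambda} + \cdots + \dfrac{[(\nabla^2-\lambda)^{p-1}u]_0}{(p-1)!}\,\dfrac{\partial^{p-1}\phi_n(r,\lambda)}{\partial\lambda^{p-1}},$


where the subscripts 0 following the brackets indicate evaluation at the center
of the sphere. Alternative forms for the mean, say in terms of Bessel functions,
are obtained from the relations


(10)    $\dfrac{\partial^k\phi_n(r,\lambda)}{\partial\lambda^k} = \left[\dfrac{r^{2k}}{2^k\,n(n+2)\cdots(n+2k-2)}\right]\phi_{n+2k}(r,\lambda).$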
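
The relation between $\lambda$-derivatives of $\phi_n$ and the functions $\phi_{n+2k}$ can be tested with finite differences. The sketch below is a hedged illustration: `phi` sums the defining power series, while the step sizes and the sample values of n, r, $\lambda$ are assumptions made for the example.

```python
def phi(n, r, lam, terms=80):
    """Partial sum of the power series for phi_n(r, lam)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= lam * r * r / ((2 * k + 2) * (n + 2 * k))
    return total

def dphi_dlam(n, r, lam, h=1e-4):
    """Central-difference estimate of the first lambda-derivative."""
    return (phi(n, r, lam + h) - phi(n, r, lam - h)) / (2.0 * h)

def d2phi_dlam2(n, r, lam, h=1e-3):
    """Central-difference estimate of the second lambda-derivative."""
    return (phi(n, r, lam + h) - 2.0 * phi(n, r, lam) + phi(n, r, lam - h)) / (h * h)

def shifted_form(n, r, lam, k):
    """r^{2k} / (2^k n(n+2)...(n+2k-2)) * phi_{n+2k}(r, lam)."""
    denom = 2 ** k
    for j in range(k):
        denom *= n + 2 * j
    return r ** (2 * k) / denom * phi(n + 2 * k, r, lam)

n, r, lam = 4, 1.3, 0.7
print(dphi_dlam(n, r, lam), shifted_form(n, r, lam, 1))
print(d2phi_dlam2(n, r, lam), shifted_form(n, r, lam, 2))
```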
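Again the case $\lambda=0$ is of special interest; now (8) becomes the repeated
Laplace equation:


(11)    $\nabla^{2p} u = 0,$


whose solutions are sometimes known as "p-harmonic" functions, while (9)
reduces to


(12)    $A(u) = u_0 + \dfrac{(\nabla^2u)_0\,r^2}{2n} + \cdots + \dfrac{(\nabla^{2p-2}u)_0\,r^{2p-2}}{2\cdot4\cdots(2p-2)\cdot n(n+2)\cdots(n+2p-4)},$


so that A(u) is a polynomial in $r^2$.
The most general extension of these laws of the mean considered in this
paper is for solutions of the differential equation


(13)    $\sum_{i=0}^{p} c_i\nabla^{2i}u = 0 \quad (c_p \ne 0),$


where $c_i$ are constants. In this case, the following law of the mean holds:


(14)    $\begin{vmatrix} A(u) & \phi_n(r,\lambda_1) & \cdots & \phi_n(r,\lambda_p) \\ u_0 & 1 & \cdots & 1 \\ (\nabla^2u)_0 & \lambda_1 & \cdots & \lambda_p \\ \vdots & \vdots & & \vdots \\ (\nabla^{2p-2}u)_0 & \lambda_1^{p-1} & \cdots & \lambda_p^{p-1} \end{vmatrix} = 0,$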


1938] LAW OF THE SPHERICAL MEAN 201 


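provided that the operator on the left of (13) factors symbolically thus:


(15)    $\sum_{i=0}^{p} c_i\nabla^{2i} = c_p\prod_{i=1}^{p}(\nabla^2 - \lambda_i),$


where no two $\lambda_i$ are alike. If, on the other hand, in the factorization (15) there
are repeated roots, then corresponding to each root of multiplicity m, m
columns of (14) past the first one are to be replaced by the $\lambda$-derivatives of
the elements in (14) of order 0, 1, ···, m−1. Thus, if


(15')    $\sum_{i=0}^{p} c_i\nabla^{2i} = c_p(\nabla^2-\lambda_1)^{m_1}(\nabla^2-\lambda_2)^{m_2}\cdots(\nabla^2-\lambda_h)^{m_h},$

where $m_1+m_2+\cdots+m_h = p$ and $\lambda_i \ne \lambda_j$ for $i \ne j$, then the elements of the
first row in (14) from the second one on are replaced by


(16)    $\phi_n(r,\lambda_1),\ \dfrac{\partial\phi_n(r,\lambda_1)}{\partial\lambda_1},\ \cdots,\ \dfrac{\partial^{m_1-1}\phi_n(r,\lambda_1)}{\partial\lambda_1^{m_1-1}},\ \phi_n(r,\lambda_2),\ \cdots,\ \dfrac{\partial^{m_h-1}\phi_n(r,\lambda_h)}{\partial\lambda_h^{m_h-1}},$


and similarly for the elements of the other rows.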
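
The polynomial law of the mean for p-harmonic functions (the case $\lambda = 0$ above) can be illustrated in the plane. The sketch below is a minimal check under stated assumptions: the polynomial `u`, its hand-computed Laplacians, and the sample center and radius are hypothetical choices, with n = 2 and p = 3 (so that $\nabla^6u = 0$).

```python
import math

def circle_mean(u, x0, y0, r, m=64):
    """Midpoint-rule mean of u over the circle of radius r about (x0, y0)."""
    total = 0.0
    for j in range(m):
        theta = (j + 0.5) * 2.0 * math.pi / m
        total += u(x0 + r * math.cos(theta), y0 + r * math.sin(theta))
    return total / m

def u(x, y):          # hypothetical 3-harmonic polynomial
    return x**4 + x * y**2

def lap_u(x, y):      # Laplacian of u, computed by hand
    return 12.0 * x**2 + 2.0 * x

def lap2_u(x, y):     # iterated Laplacian of u; the next iterate vanishes
    return 24.0

def mean_by_polynomial_law(x0, y0, r, n=2):
    """u_0 + (lap u)_0 r^2 / 2n + (lap^2 u)_0 r^4 / (2*4*n(n+2))."""
    return (u(x0, y0)
            + lap_u(x0, y0) * r**2 / (2 * n)
            + lap2_u(x0, y0) * r**4 / (2 * 4 * n * (n + 2)))

x0, y0, r = 0.7, -0.4, 1.5
print(circle_mean(u, x0, y0, r), mean_by_polynomial_law(x0, y0, r))
```

Because the integrand is a trigonometric polynomial of low degree, the equally spaced rule is exact up to rounding, so the agreement holds for a large radius as well as a small one.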
An interesting application of (5) occurs in establishing the expansion


(17)    $A(u) = \sum_{k=0}^{\infty} \dfrac{(\nabla^{2k}u)_0\,r^{2k}}{2\cdot4\cdots2k\cdot n(n+2)\cdots(n+2k-2)}$


of the mean A(u) of an arbitrary analytic function u in powers of the radius r
(§4). This series will be recognized as a series whose first p terms agree with
the right-hand side of (12); it can also be given the symbolic form


(18)    $A(u) = \phi_n(r, \nabla^2)\,u_0.$


Utilizing (17), the following illuminating interpretation is derived for
$\nabla^{2k}u$ at a point O:


(19)    $(\nabla^{2k}u)_0 = \dfrac{2\cdot4\cdots2k\cdot n(n+2)\cdots(n+2k-2)}{(2k)!}\;\overline{\left(\dfrac{\partial^{2k}u}{\partial r^{2k}}\right)}_0,$

where the last factor denotes the mean of the directional derivatives of order
2k of the function u in all directions through O, averaged over the solid angle
through O (§4). A further interpretation for $\nabla^2$, $\nabla^4$, ···, also derived from
(17), is given by

(20)    $(\nabla^2u)_0 = 2n\lim_{r\to0}\,[A(u) - u_0]/r^2,$

        $(\nabla^4u)_0 = 2\cdot4\cdot n(n+2)\lim_{r_2\to0}\lim_{r_1\to0}\dfrac{[A_2(u) - u_0]/r_2^2 - [A_1(u) - u_0]/r_1^2}{r_2^2},$




where $A_1(u)$, $A_2(u)$ are the means over spheres of radii $r_1$, $r_2$, and the repeated
limit is obtained by letting $r_1$ approach zero first. The interpretations (19)
and (20) are valid not merely for analytic functions u, but each one also for
functions u possessing a sufficient number of continuous derivatives.
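
The limit interpretation of the Laplacian and of its iterate can be tried numerically in the plane (n = 2). The following sketch is illustrative only: the test function, the point, and the finite radii standing in for the limits are assumptions, and with finite radii the quotients are approximations rather than exact values.

```python
import math

def circle_mean(u, x0, y0, r, m=64):
    """Midpoint-rule mean of u over the circle of radius r about (x0, y0)."""
    total = 0.0
    for j in range(m):
        theta = (j + 0.5) * 2.0 * math.pi / m
        total += u(x0 + r * math.cos(theta), y0 + r * math.sin(theta))
    return total / m

def u(x, y):                       # hypothetical smooth test function
    return math.exp(x) * math.sin(2.0 * y) + x * x * y

def lap_u(x, y):                   # exact Laplacian, for comparison
    return -3.0 * math.exp(x) * math.sin(2.0 * y) + 2.0 * y

def lap2_u(x, y):                  # exact iterated Laplacian, for comparison
    return 9.0 * math.exp(x) * math.sin(2.0 * y)

def laplacian_from_mean(u, x0, y0, r, n=2):
    """Quotient 2n [A(u) - u_0] / r^2 at a small but finite radius r."""
    return 2 * n * (circle_mean(u, x0, y0, r) - u(x0, y0)) / r**2

def bilaplacian_from_means(u, x0, y0, r1, r2, n=2):
    """Repeated-limit quotient with r1 much smaller than r2."""
    q1 = (circle_mean(u, x0, y0, r1) - u(x0, y0)) / r1**2
    q2 = (circle_mean(u, x0, y0, r2) - u(x0, y0)) / r2**2
    return 2 * 4 * n * (n + 2) * (q2 - q1) / r2**2

x0, y0 = 0.4, 0.3
print(laplacian_from_mean(u, x0, y0, 0.01), lap_u(x0, y0))
print(bilaplacian_from_means(u, x0, y0, 0.01, 0.1), lap2_u(x0, y0))
```

Taking $r_1$ much smaller than $r_2$ mimics letting $r_1$ approach zero first; shrinking both radii too far would let rounding error dominate the differences.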

As a further application of (17) are considered in §5 the functional relations
existing in certain cases between the means $A_1(u)$, $A_2(u)$ of u over
"subspheres" lying in two mutually totally perpendicular flats of m, n−m
dimensions (§5, equations (64), (70), (71), (74), (75), (77), (79)). These include
a theorem of Asgeirsson and some theorems of Bateman. The functional
relations of §5 are utilized in §6 for inverting the averaging operation A, under
certain assumptions regarding the function u.

One feature that is common to the laws of the mean (2), (4), (9), (14),
as well as the indicated modification of (14), is that in every case A(u) is
linearly dependent upon p functions $f_1(r), \cdots, f_p(r)$, which depend on the
radius r but are independent of the center O, while the coefficients of
dependence, $C_i$, are independent of the radius r but do depend upon the position
of the center O. Thus,


(21)    $A(u) = C_1f_1(r) + \cdots + C_pf_p(r),$


or more precisely,
(21')    $[A(u)]_O = C_1(O)f_1(r) + \cdots + C_p(O)f_p(r).$


This property of solutions of these differential equations actually completely 
characterizes them, as is shown by the following converse (§7) : 


CONVERSE THEOREM. Let there be given a function u over a region R and of
class $C^{(2p)}$ there. Let


(22)    $f_1(r), \cdots, f_p(r)$


be p linearly independent functions of a variable r, of class $C^{(2p)}$ in r for $0 \le r < \rho$
and whose odd derivatives $f_i^{(2k+1)}(r)$, $k = 0, \cdots, p-1$, vanish at r=0.* If (21')
holds for a sphere of radius r and center at O for any position of the center O in
R and sufficiently small radius r, then u must satisfy an equation of the form (13)
for proper constants $c_i$, while the functions $f_i(r)$ must reduce to a set of p solutions
of the ordinary differential equation


(23)    $\sum_{i=0}^{p} c_i\left(\dfrac{d^2}{dr^2} + \dfrac{n-1}{r}\dfrac{d}{dr}\right)^{i} f = 0$


* This apparent restriction on $f_i(r)$ is actually satisfied by A(u) as well as by the functions $f_i(r)$
of the preceding laws of the mean.




which are analytic at r=0. The latter solutions are exhibited above in (16) for
the general case in which the operator on the left of (13) factors as in (15').


The differential equation (23) results from (13) if it is supposed that u
depends only upon r, the distance from a fixed point. We shall refer to such
solutions as "symmetric" solutions.

A particularly interesting special case of this converse is given by the
following result:

If, for a function u of class $C^{(2p)}$ in R


(24)    A(u) = polynomial of degree p in $r^2$,


for any position of the center O in R and sufficiently small radius r, then u
satisfies (11) (or is p-harmonic).

The above laws of the mean and their converse involve symmetric solutions
of (13) (or solutions of (23)) which are analytic at r=0. Symmetric
solutions of (13) which are not analytic at r=0 occur when the function u
under consideration satisfies the proper differential equation not in the complete
interior of a sphere S but only over a spherical shell $R_{a,b}$, between two
concentric spheres $S_a$, $S_b$ of radii a, b; $0 < a \le r \le b$ (§3). As an example, in a
three-space, if u is harmonic in such a spherical shell $R_{a,b}$, then for the various
spheres concentric with $S_a$, $S_b$ the mean is given by


(25)    $A(u) = C_1 + C_2/r,$


where $C_1$, $C_2$ are constants. Even this simple result does not appear to be as
familiar as its simplicity warrants.
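
The shell result can be made concrete with a function that is harmonic away from one interior point. In the sketch below, which is an illustration under assumed data, the point `P`, the constants in `u`, the radii, and the quadrature sizes are hypothetical; for this choice the constants of the shell law are $C_1 = 3$ and $C_2 = 1$ on every sphere about the origin that encloses `P`.

```python
import math

def sphere_mean_about_origin(u, r, n_t=120, n_phi=120):
    """Midpoint-rule mean of u over the sphere of radius r about the origin."""
    total = 0.0
    for i in range(n_t):
        t = -1.0 + (i + 0.5) * 2.0 / n_t       # t = cos(theta), equal-area bands
        s = math.sqrt(1.0 - t * t)
        for j in range(n_phi):
            phi = (j + 0.5) * 2.0 * math.pi / n_phi
            total += u(r * s * math.cos(phi), r * s * math.sin(phi), r * t)
    return total / (n_t * n_phi)

P = (0.1, 0.05, -0.1)     # hypothetical singular point, at distance 0.15 from the origin

def u(x, y, z):
    """Harmonic for (x, y, z) != P: constant + linear term + Newtonian potential."""
    dist = math.sqrt((x - P[0])**2 + (y - P[1])**2 + (z - P[2])**2)
    return 3.0 + 0.5 * x + 1.0 / dist

for radius in (0.5, 1.0, 2.0):
    print(radius, sphere_mean_about_origin(u, radius), 3.0 + 1.0 / radius)
```

The linear term averages out, and the potential of the off-center point contributes exactly as if the point sat at the center, which is why the mean has the form $C_1 + C_2/r$ rather than a constant.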

The method of proof used in deducing the various laws of the mean is,
itself, of some interest, particularly in view of its elegance and simplicity.
It consists in regarding the operation which replaces u over each of a concentric
family of spheres by its spherical mean A(u) over that surface, as a
linear functional operation A, and utilizing the permutability of the operator
A with the operator $\nabla^2$. This operator A is discussed in a preceding paper*
to which we shall refer briefly as I, where its above mentioned property is
proved (I, Theorem 1). We shall suppose that the reader has familiarized
himself with I, at least with its introductory §1, with the definition of A and
of the other operators contained in it, and with the statement of the theorems.
The details of the proof of I, however, are not essential for a thorough
understanding of the present paper.

Some of the above results, a search of the literature has revealed, are not 


* On operations permutable with the Laplacian, American Journal of Mathematics, vol. 45 (1932), 
p. 667. 




new. Thus, equations (2) and (4) have been given by H. Weber,* while a
result equivalent to (17) for n=3 has been obtained by W. D. Niven.† These
results were, nevertheless, included, because it is believed that the present
proofs have decided advantages both in regard to unity and simplicity.

Many of the results of this paper can be generalized by passing from the
operator A to other operators considered in I, which are likewise permutable
with $\nabla^2$. These generalizations are considered briefly in §9. A formula
established in §8, due to Hobson, forms a natural transition to these
generalizations. At the end of §9 are considered extensions to certain discontinuous
functions.

Many interesting applications of the results of this paper can be made.
Thus the integral representations of $J_0(r)$:


$J_0(r) = \dfrac{1}{2\pi}\int_0^{2\pi} e^{ir\cos\theta}\,d\theta = \dfrac{1}{\pi}\int_{-1}^{1} \dfrac{e^{irt}}{(1-t^2)^{1/2}}\,dt$

result if one starts with the simple solution of (1) in the plane for
$\lambda = -1$: $u = e^{ix_1}$, and averages it over circles with center at the origin.
Again, the Laplace integral for the Legendre polynomial:


$P_n(z) = \dfrac{1}{\pi}\int_0^{\pi} [z + (z^2-1)^{1/2}\cos\theta]^n\,d\theta$


is obtained by starting with the elementary harmonic function $(x_1 + ix_2)^n$ in
three-space and averaging it over circles having the $x_1$-axis as their axis.
Similarly, the expansion


$(1 - 2r\cos\theta + r^2)^{-1/2} = \sum_{n=0}^{\infty} r^nP_n(\cos\theta)$

may be proved by averaging over the above circles the geometric series

$\dfrac{1}{1 - (x_1 + ix_2)} = \sum_{n=0}^{\infty} (x_1 + ix_2)^n.$

However, a systematic application of the results of this paper is reserved for
a forthcoming paper entitled On integral representation of Bessel and related
functions.

Another forthcoming paper somewhat related to the present one is entitled
Green's formulas for analytic functions. In this paper is proved the
analyticity of solutions of (13) as well as their expansibility in spherical
harmonics.
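
The first of these applications is easy to reproduce numerically. The sketch below, offered as an illustration with assumed helper names and sample radii, averages the plane wave $e^{ix_1}$ over circles about the origin and compares the result with the power series of $J_0(r)$.

```python
import cmath
import math

def circle_average_of_plane_wave(r, m=64):
    """Mean of u = exp(i x1) over the circle of radius r about the origin."""
    total = 0j
    for j in range(m):
        theta = (j + 0.5) * 2.0 * math.pi / m
        total += cmath.exp(1j * r * math.cos(theta))
    return total / m

def j0_series(r, terms=40):
    """Power series J_0(r) = sum over k of (-1)^k (r/2)^(2k) / (k!)^2."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= -(r / 2.0) ** 2 / ((k + 1) ** 2)
    return total

for r in (0.5, 2.0, 5.0):
    avg = circle_average_of_plane_wave(r)
    print(r, avg.real, j0_series(r), abs(avg.imag))
```

The imaginary part of the average cancels by symmetry, so only the real part carries information; the agreement with the series is the law of the mean for n = 2 and $\lambda = -1$ with $u_0 = 1$.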


* H. Weber, Mathematische Annalen, vol. 1 (1869), p. 7; Crelle, vol. 49 (1868), p. 222. 
† W. D. Niven, Transactions of the Royal Society of London, vol. 170 (1880), p. 379.




2. On solutions of $\sum c_iL^i(u) = 0$ for general linear operators L. Application
to symmetric solutions of (13). In this section we shall obtain explicit
forms for symmetric solutions of (13), that is for solutions of (23). As noted
above, these solutions play an essential role in the laws of the mean considered
in this paper.

Before taking up (23) and its solutions consider the equation


(26)    $L(u) - \lambda u = 0,$
where L is a linear functional operator, and $\lambda$ is an arbitrary constant. It will
be shown how solutions of (26) can be made to yield solutions of


(27)    $\sum_{i=0}^{p} c_iL^i(u) = 0, \quad c_p \ne 0,$

where $c_i$ are constants. By specializing to the case $L = d^2/dr^2 + [(n-1)/r]\,d/dr$,
(23) will be obtained. Applications to other operators L occur in §9.

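The solution of (26) depends upon the parameter $\lambda$ as well as upon proper
independent variables; consider a solution u which is analytic in $\lambda$ at $\lambda=\lambda_0$.
Expanding u in powers of $\lambda-\lambda_0$:

(28)    $u = \sum_{k=0}^{\infty} \left(\dfrac{\partial^k u}{\partial\lambda^k}\right)_{\lambda=\lambda_0}\dfrac{(\lambda-\lambda_0)^k}{k!},$

substituting in (26) written in the form
(26')    $[(L - \lambda_0) - (\lambda - \lambda_0)]u = 0,$
applying L term-wise to the right-hand side of (28), and comparing coefficients
of like powers of $\lambda-\lambda_0$ on both sides, we obtain the recurrence relations
(omitting the subscript in $\lambda_0$):

(29)    $(L-\lambda)\,\dfrac{1}{k!}\dfrac{\partial^k u}{\partial\lambda^k} = \begin{cases} 0 & \text{for } k = 0,\\[4pt] \dfrac{1}{(k-1)!}\dfrac{\partial^{k-1}u}{\partial\lambda^{k-1}} & \text{for } k > 0,\end{cases}$

provided the term-wise application of L is justifiable. From (29) it follows that


(30)    $u,\ \dfrac{\partial u}{\partial\lambda},\ \cdots,\ \dfrac{\partial^{p-1}u}{\partial\lambda^{p-1}}$


are solutions of the functional equation
(31)    $(L - \lambda)^p u = 0.$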
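
A one-dimensional instance makes the recurrence tangible. In the sketch below, which is an assumed example rather than anything from the paper, L is taken to be $d^2/dx^2$, the solution of $L(u) - \lambda u = 0$ is $u = e^{\lambda^{1/2}x}$, and its $\lambda$-derivative is $x e^{\lambda^{1/2}x}/(2\lambda^{1/2})$; the operator $L - \lambda$ is applied by central differences, so the checks hold only up to discretization error.

```python
import math

LAM = 2.0
S = math.sqrt(LAM)

def u(x):
    """Solution of (L - lam) u = 0 for L = d^2/dx^2."""
    return math.exp(S * x)

def du_dlam(x):
    """Derivative of exp(sqrt(lam) x) with respect to lam."""
    return x * math.exp(S * x) / (2.0 * S)

def L_minus_lam(f, h=1e-2):
    """Return x -> f''(x) - lam f(x), with f'' taken by central differences."""
    def g(x):
        return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h) - LAM * f(x)
    return g

x = 0.7
print(L_minus_lam(u)(x))                          # close to 0
print(L_minus_lam(du_dlam)(x), u(x))              # close to each other
print(L_minus_lam(L_minus_lam(du_dlam))(x))       # close to 0
```

The second line is the recurrence for k = 1, and the third shows that the $\lambda$-derivative is annihilated by $(L - \lambda)^2$ although not by $L - \lambda$ itself.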


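Now, consider the functional equation (27). Factoring the operator in
(27), there results


(32)    $\sum_{i=0}^{p} c_iL^i = c_p(L-\lambda_1)^{m_1}(L-\lambda_2)^{m_2}\cdots(L-\lambda_h)^{m_h},$

where $\lambda_1, \lambda_2, \cdots$ are the roots of $\sum c_i\lambda^i = 0$, and $m_1, m_2, \cdots$ their respective
multiplicities. Now obviously, solutions, say, of $(L-\lambda_1)^{m_1}V = 0$ are also solutions
of $(L-\lambda_1)^{m_1}(L-\lambda_2)^{m_2}V = 0$, and, therefore, also of (27). Hence we conclude
that from the solution u of (26), the following p solutions of (27) may
be obtained:


(33)    $\left.\dfrac{\partial^i u}{\partial\lambda^i}\right|_{\lambda=\lambda_j};\quad i = 0, \cdots, m_j-1;\quad j = 1, \cdots, h.$


It will be noted that the $\lambda$-differentiations in (30), (33) could be replaced
by differentiation with respect to $\mu$, where $\mu$ is a properly differentiable function
of $\lambda$. Thus, the functions

(30')    $u,\ \dfrac{\partial u}{\partial\mu},\ \cdots,\ \dfrac{\partial^{p-1}u}{\partial\mu^{p-1}}$


are solutions of (31). Indeed, from the differentiation formulas


$\dfrac{\partial u}{\partial\mu} = \dfrac{\partial u}{\partial\lambda}\dfrac{d\lambda}{d\mu}, \quad \dfrac{\partial^2u}{\partial\mu^2} = \dfrac{\partial^2u}{\partial\lambda^2}\left(\dfrac{d\lambda}{d\mu}\right)^2 + \dfrac{\partial u}{\partial\lambda}\dfrac{d^2\lambda}{d\mu^2}, \ \cdots$


it follows that the functions (30') can be expressed linearly in terms of the
functions (30).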

Returning now to symmetric solutions of the differential equations (1),
(8), (11), (13), we replace $\nabla^2$ by (I, (3))*


$\dfrac{d^2}{dr^2} + \dfrac{n-1}{r}\dfrac{d}{dr},$


thus converting them into ordinary differential equations. Thus (1) becomes


(34)    $\dfrac{d^2w}{dr^2} + \dfrac{n-1}{r}\dfrac{dw}{dr} - \lambda w = 0.$

Two solutions of (34) are readily verified to be $\phi_n(r, \lambda)$ given by (6) and

(35) = (2k) [(2—m)(4—n) - - - (2—n+-2h)]’, 
k=0 


where 


* As explained in §1, I, (3) refers to equation (3) of the previously cited paper I. 


$1$, if n is odd, or if n is even and $k < (n/2) - 1$,


$\log r$, if n is even and $k = (n/2) - 1$,


1 1 1 


1 1 1 
+ =) ifmis even and > (n/2) —1, 


and where the prime following the brackets $[(2-n)(4-n)\cdots(2-n+2k)]'$
indicates that the factor zero that would occur in the indicated product for
even n and $k = (n/2)-1$ should be omitted. These solutions result in a natural
manner when one attempts to integrate (34) by means of a power series in $\lambda$
admitting term-wise r-differentiation. The solution $\phi_n$ is analytic for all r
and $\lambda$; likewise for $\psi_n$ except for r=0. We denote the coefficient of $\lambda^k$ in (35)
by $V_{n,k+1}(r)$ or by $V_{k+1}(r)$ if the omission of the subscript n leads to no
confusion, thus:


(36)    $\psi_n(r, \lambda) = \sum_{k=0}^{\infty} \lambda^kV_{n,k+1}(r) = \sum_{k=0}^{\infty} \lambda^kV_{k+1}(r).$


k=0 


Applying the conclusions established above regarding solutions of the
functional equation (26) to the present case, there follows for $k \ge 1$


(37)    $\nabla^2V_{k+1} = V_k, \quad \nabla^2\left[\dfrac{r^{2k}}{2\cdot4\cdots2k\cdot n(n+2)\cdots(n+2k-2)}\right] = \dfrac{r^{2k-2}}{2\cdot4\cdots(2k-2)\cdot n\cdots(n+2k-4)},$
while
(37')    $\nabla^2V_1 = 0,$


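are symmetric solutions of (11). These results are obtained for $\lambda_0=0$. Similarly
for general $\lambda$, are obtained the recurrence relations


(39)    $(\nabla^2-\lambda)\,\dfrac{1}{k!}\dfrac{\partial^k\phi_n}{\partial\lambda^k} = \dfrac{1}{(k-1)!}\dfrac{\partial^{k-1}\phi_n}{\partial\lambda^{k-1}}, \quad (\nabla^2-\lambda)\,\dfrac{1}{k!}\dfrac{\partial^k\psi_n}{\partial\lambda^k} = \dfrac{1}{(k-1)!}\dfrac{\partial^{k-1}\psi_n}{\partial\lambda^{k-1}}$


for $k > 0$,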


and the following symmetric solutions of (8): 


Finally, we conclude that symmetric solutions of (13) are given by (16) and
by similar derivatives of $\psi_n(r, \lambda)$.

The various solutions of the proper (ordinary) differential equations just 
obtained may be shown to be linearly independent thus furnishing a com- 
plete set of such solutions upon which any solution would depend linearly. 

For $\lambda \ne 0$ these solutions may be expressed in terms of Bessel functions.
Indeed, by introducing $v = r^{n/2-1}w$ and letting $y = (-\lambda)^{1/2}r$ (where either
determination of $(-\lambda)^{1/2}$ is used) equation (34) becomes


$\left[\dfrac{d^2}{dy^2} + \dfrac{1}{y}\dfrac{d}{dy} + \left(1 - \dfrac{(n/2-1)^2}{y^2}\right)\right]v = 0.$

The bracket will be recognized as the Bessel differential operator of order
m = n/2−1, applied to v. Hence for $\lambda \ne 0$ solutions of (34) can be expressed

linearly, say, in terms of

(41)    $y^{-m}J_m(y);\quad y^{-m}J_{-m}(y)$ for n odd, $\quad y^{-m}Y_m(y)$ for n even,


where


$y = (-\lambda)^{1/2}r, \quad m = n/2 - 1.$*


An advantage of the solution $\psi_n$ over the Bessel function form lies in its
analyticity at $\lambda=0$. The relation between $\phi_n$ and the Bessel functions is given
by (7); the expression of $\psi_n$ in terms of the latter is given by
Valr, d) ri m)(— ] for odd, 


1 1 (—A)!/2 
2 m 2 


for m even, 


where, as in (41) y=(—A)"/*r7, m=n/2—1. Similarly, for non-vanishing \, 
the symmetric solutions of (8) can be expressed in terms of Bessel functions, 
a set of solutions being furnished by 


* E. W. Hobson, Proceedings of London Mathematical Society, vol. 25 (1893), p. 49. Hobson 


suggests the term “rank” for n. 
The Bessel function notation is that of G. N. Watson’s Theory of Bessel Functions, Cambridge 


University Press. 


(40) t= 0,---, p— 1. 

(42) 




J—em-i(y) for n odd ] m=n/2—1, y 


—m+i 
(43) y [ Jars), i=0,---,p-—1. 


Ynii(y) for even 
To prove this, replace the \-differentiations by differentiations with respect 
to log (A)?/? (or log (—X)"/*) thus operating on (41) with the operator 2\0/dX 
= yd/dy, and utilize the formulas 


(44) |/dz = — 2-°Jy41(z) 


and similar formulas for z’J,(z) and z-’Y,(z).* 

It may be concluded from the above that Bessel functions of order 
m=n/2—1 and argument y=(—))"/*r are linearly dependent on the func- 
tions y"¢,(r, d), y"W.(r, X). Hence the functions appearing in (43) and there- 
fore also in (40) can be expressed linearly in terms of y*‘¢,42:, ¥?‘Wn+2i. Thus a 
set of solutions of (8) for \+0 is also furnished by 


(45) A), d); 1. 


For \~0 this set is, again, linearly independent. 

The symmetric solutions of (13) can now be similarly expressed in terms 
of Bessel functions by means of the factorization (15’), a set of the form (43) 
or (45) corresponding to each repeated root. 

3. Laws of the mean for solutions of (13). Consider a solution of (1) in a
spherical shell $R_{a,b}$ or in a sphere $R_b$. Applying the operator A to both sides
of (1) for spheres concentric with boundaries there follows


$A(\nabla^2u) - \lambda A(u) = 0.$


Now, by I, Theorem 1, the first term above may be replaced by $\nabla^2[A(u)]$.
Hence, A(u) also satisfies (1). Since A(u) is symmetric, we conclude that


(46)    $A(u) = C\phi_n(r, \lambda) + D\psi_n(r, \lambda),$


where C, D are constants. The values of the latter depend upon the particular
solution u as well as upon the position of the center.

Thus it follows from I, Theorem 1, that if u satisfies (1) in a spherical
region $R_b$, then such is also the case with A(u). Since A(u) is of class $C''$ at
r=0, the constant D in (46) must now vanish. Putting r=0 to determine the
remaining constant C, we obtain


$A(u)\big|_{r=0} = C.$


But $A(u)\big|_{r=0} = u_0$. Hence (5) results. For n=2 and 3 it yields (4) and (2)


* Watson, loc. cit., p. 66. 




respectively. For $\lambda=0$, that is for harmonic functions, (5) reduces to the
Gauss law of the mean, while (46) yields


(47)    $A(u) = C + DV_1(r),$


which reduces to (25) for n=3.

More generally, by applying the operator A to both sides of (13) and permuting
A with $\nabla^2$, $\nabla^4$, ··· (see I, §9) one proves similarly that if u is a
solution of (13) in $R_{a,b}$ or in $R_b$, then A(u), likewise, satisfies (13) there. Being
symmetric A(u) must therefore reduce to a linear combination of the functions
(16) and of the corresponding $\psi$-functions. Thus, for the general case
of (13), (15)


(m, — 


We shall determine the coefficients for the case when u satisfies (13) in $R_b$,
that is for r < b, r=0 included.

Applying (V?—)2)™ - - - (V?—A,)™ to (48) and utilizing (40) it is found 
that the ¢, y functions corresponding to the roots dz, As, - - - drop out. Apply- 
ing products of the above operator by (V?—A,)™~}, and utilizing (39), there 
results 
— — (WE Aa) (a) 

= (Ar — (Ar — + Dn W(r, 
Since A(u) is of class C?») at r=0, while the left-hand operator is of 
class C°?~), it follows that D,,=0. Similarly applying products of 
- (V?—A,)™ by , (V?—A1), one proves that 
Dm,-1,* + > , D, all vanish. Putting r=0 in the m, equations obtained yields 
successively the values C,,,, , Ci. 

A similar procedure eliminates the D’s and evaluates the C’s correspond- 
ing to the repeated factors Xs, As, - - - . There results 


— - - — ™ 
(m, — 1)! a 


A(u) = S$ 
(49) 


’ 
r=0 


where the operators $\nabla^2$, $\nabla^4$, ··· all operate only on u (but not on $\phi$,
$\partial\phi/\partial\lambda$, ···), and S indicates a summation extended to similarly constituted
terms corresponding to the repeated roots $\lambda_2, \cdots, \lambda_h$; the operators A have
been suppressed from A(u), $\nabla^2A(u)$, ··· at r=0 by applying $\nabla^2$, $\nabla^4$, ···
first and replacing the mean at a point by the function itself.




For special cases both the above proof and the results simplify. Thus if 
the roots are all alike, that is, in case (8), (49) reduces to (9). If the roots are 
all distinct, (49) becomes 


(V2 — Az) - (V2? — Ap) 
(50) 
- 


To obtain the forms of these laws of the mean given by (14) and its modifications
indicated in §1, we proceed directly from (48) after the $\psi$'s have been
eliminated, applying $\nabla^2, \cdots, \nabla^{2p-2}$ to both sides, putting r=0, utilizing

(51)    $\left[\nabla^{2i}\,\dfrac{\partial^j\phi_n(r,\lambda)}{\partial\lambda^j}\right]_{r=0} = \dfrac{\partial^j\lambda^i}{\partial\lambda^j},$


and eliminating $C_i$. The relation (51) is proved by interchanging the order
of the $\lambda$- and the $\nabla^2$-differentiations, replacing $\nabla^{2i}\phi$ by $\lambda^i\phi$, and putting r=0
before carrying out the $\lambda$-differentiations.

4. The spherical mean of analytic and regular functions. Interpretations
of the iterated Laplacians. We shall indicate two proofs of (17) for the mean
of analytic functions. Consider the function


$u = u_0\exp[c_1x_1 + \cdots + c_nx_n],$


where $c_1, \cdots, c_n$ and $u_0$ are constants. This function is a solution of (1) with
$\lambda = c_1^2 + \cdots + c_n^2$. Applying the law of the mean (5) to u with the center O
of the spheres at the origin, we obtain for all r


(52)    $A(\exp[c_1x_1 + \cdots + c_nx_n]) = \phi_n(r, c_1^2 + \cdots + c_n^2).$


Now consider an arbitrary analytic function u; placing the origin at the center
O, we may write the Taylor series formally thus:


(53)    $u = \exp[x_1(\partial/\partial x_1) + \cdots + x_n(\partial/\partial x_n)]\,u_0,$


where the exponential is to be expanded as an n-fold power series, each term
multiplied by $u_0$, and the formal product then replaced by the corresponding
partial derivative of u at the origin. Since the convergence for r less than a
proper $\rho>0$ is uniform, we may average term by term; the result is, therefore,
the same as the average for the above example written as an n-fold
power series in $c_i$ where the latter are replaced by the fictitious quantities
$\partial/\partial x_i$ and the resulting terms interpreted as above. Hence, for $r<\rho$ (18)
follows, or, more explicitly, (17).

A more direct proof of (17) is obtained by expanding A () in powers of r?: 


212 HILLEL PORITSKY [March 


(54)    $A(u) = C_0 + C_2 r^2 + C_4 r^4 + \cdots,$

and determining the constants $C_{2i}$ by applying $\nabla^{2i}$ to both sides, replacing
$\nabla^{2i} A(u)$ by $A(\nabla^{2i} u)$, putting r=0, and utilizing the latter part of (37). To
justify the expansion (54), write the Taylor series of u in the form


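(55)    $u = \sum_{k=0}^{\infty} \frac{1}{k!} \left( x_1 \frac{\partial}{\partial x_1} + \cdots + x_n \frac{\partial}{\partial x_n} \right)^k u_0 = \sum_{k=0}^{\infty} \frac{r^k}{k!} \left( \frac{\partial^k u}{\partial r^k} \right)_0,$

where $\partial/\partial r$ denotes differentiation along the ray from the origin through
$x_1, \cdots, x_n$, with respect to the distance r from the origin. Upon taking means
of both sides a series such as (54) will result, since, for odd k the contributions
from opposite directions toward the mean will cancel each other.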

Comparing the coefficients of the powers of r? in (17) with those obtained 
by averaging the last member of (55) we obtain (19). To derive the first rela- 
tion (20), transpose the first term on the right of (17) (k=0) to the left, 
divide by r?, and let r approach zero; the further relations (20) are derived in 
a similar fashion by utilizing the preceding ones. 

The interpretations (19), (20), for $\nabla^2, \nabla^4, \cdots$ are proved for non-analytic
functions of class $C^{(2k)}$ for proper k by replacing (53), (55) by finite
sums with a remainder term of the form $o(r^{2k})$ and deriving from these a
similar modification of (17).

From (17) are readily derived the formulas* 


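(56)    $\int u\, dS = K_n \sum_{k=0}^{\infty} \frac{(\nabla^{2k} u)_0\, r^{2k+n-1}}{2 \cdot 4 \cdots 2k \cdot n(n+2) \cdots (n+2k-2)},$

(57)    $\int u\, dV = K_n \sum_{k=0}^{\infty} \frac{(\nabla^{2k} u)_0\, r^{2k+n}}{2 \cdot 4 \cdots 2k \cdot n(n+2) \cdots (n+2k)}$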


for the integrals of u over the surface and the volume of S. Likewise by differ- 
entiating (17) with respect to r, then multiplying by S, one may obtain simi- 
lar series for the surface integrals over S of the normal derivative of u of any 
order. 

5. Applications of (17). As a further application of (17), consider the
means $A_1(u)$, $A_2(u)$ of analytic functions over “subspheres” lying
in two mutually totally perpendicular flats of m, n−m dimensions:

(58)    $x_1^2 + \cdots + x_m^2 = r^2, \quad x_{m+1} = \cdots = x_n = 0,$
(59)    $x_{m+1}^2 + \cdots + x_n^2 = r^2, \quad x_1 = \cdots = x_m = 0.$

Application of (17) (for sufficiently small r) yields


* Here, as in I, $K_n$ denotes the “area” of a unit sphere in n dimensions.


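(60)    $A_1(u) = \sum_{k=0}^{\infty} (D_1^k u)_0\, r^{2k} / [2 \cdot 4 \cdots 2k \cdot m(m+2) \cdots (m+2k-2)],$

(61)    $A_2(u) = \sum_{k=0}^{\infty} (D_2^k u)_0\, r^{2k} / [2 \cdot 4 \cdots 2k \cdot (n-m)(n-m+2) \cdots (n-m+2k-2)],$

where

(62)    $D_1 = \frac{\partial^2}{\partial x_1^2} + \cdots + \frac{\partial^2}{\partial x_m^2}, \quad D_2 = \frac{\partial^2}{\partial x_{m+1}^2} + \cdots + \frac{\partial^2}{\partial x_n^2}.$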


It will be shown that in certain cases there exists a linear functional relation
between the means $A_1(u)$, $A_2(u)$.
Thus, if u is harmonic,

        $D_1 u + D_2 u = 0,$

whence

        $D_1 u = -D_2 u.$

Applying $D_1$ to both sides, there results

        $D_1^2 u = -D_1 D_2 u = -D_2 D_1 u = D_2^2 u,$

and, by induction, for any k

(63)    $D_1^k u = (-1)^k D_2^k u.$

There exists thus a definite ratio between the coefficients of $r^{2k}$ in (60) and
(61): this proves the above statement.

Denote $A_1(u)$, $A_2(u)$ by $f_1(r)$, $f_2(r)$. For even n and m=n/2 (and harmonic
functions u), the functional relation reduces to

(64)    $f_1(r) = f_2(ir).$
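
Relation (64) can be illustrated in the simplest admissible case n=2, m=1, where each "sphere" is a pair of points on a coordinate axis. The sketch below assumes the harmonic function u = exp(x) cos(y), chosen here only as an example; the names `u`, `f1`, `f2` are illustrative. The complex-math module is used so that $f_2$ can be evaluated at the imaginary argument ir.

```python
import cmath

def u(x, y):
    # exp(x) * cos(y) is harmonic in the plane; cmath allows complex arguments
    return cmath.exp(x) * cmath.cos(y)

def f1(r):
    # mean over the two points x = +r, -r on the x-axis
    return 0.5 * (u(r, 0) + u(-r, 0))

def f2(r):
    # mean over the two points y = +r, -r on the y-axis
    return 0.5 * (u(0, r) + u(0, -r))

for r in (0.3, 1.0, 2.5):
    print(r, f1(r), f2(1j * r))
```

Here $f_1(r) = \cosh r$ and $f_2(r) = \cos r$, so that $f_2(ir) = \cosh r$, in agreement with (64).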


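For other cases, the relation between $f_1(r)$, $f_2(r)$ may be obtained as follows:
Consider the series

(65)    $g_p(r) = \sum_{k=0}^{\infty} \frac{C_k r^{2k}}{p(p+2) \cdots (p+2k-2)} = \Gamma\!\left(\frac{p}{2}\right) \sum_{k=0}^{\infty} \frac{C_k}{\Gamma(k + p/2)} \left( \frac{r^2}{2} \right)^k$

for integer p, where the $C_k$ do not vary with p. One may express $g_{p+q}(r)$ for
q>0 in terms of $g_p(r)$ as follows:

(66)    $g_{p+q}(r) = \frac{2\Gamma[(p+q)/2]}{\Gamma(p/2)\Gamma(q/2)}\, r^{2-p-q} \int_0^r s^{p-1} g_p(s) (r^2 - s^2)^{q/2-1}\, ds.$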


This relation is even simpler if the functions 


(67) Gp(r) = 


are introduced, whereupon (66) is replaced by 
r2 
0 


Either (66) or (68) is readily established by termwise multiplication and
integration, utilizing the Eulerian integral of the first kind; the results are
valid at least in the circle $|r| < \rho$ in which $g_p$ is analytic. The relation (68)
will be recognized as an Abel integral equation; it is related to “fractional
integrals”* and will be recognized as equivalent to


d —q/2 
G,(r). 


(68") Grae) = | 
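
The termwise verification by the Eulerian integral of the first kind can be sketched numerically. The code assumes that (66) has the form $g_{p+q}(r) = \frac{2\Gamma[(p+q)/2]}{\Gamma(p/2)\Gamma(q/2)} r^{2-p-q} \int_0^r s^{p-1} g_p(s)(r^2-s^2)^{q/2-1} ds$, an assumption patterned on (70) and (81); the helper names are illustrative. For each k the kernel should turn the coefficient $1/[p(p+2)\cdots(p+2k-2)]$ into $1/[(p+q)(p+q+2)\cdots(p+q+2k-2)]$.

```python
import math

def rising_product(p, k):
    """p (p+2) ... (p+2k-2); the empty product (k = 0) is 1."""
    out = 1.0
    for j in range(k):
        out *= p + 2 * j
    return out

def transformed_coefficient(p, q, k):
    """Coefficient of C_k r^(2k) after the assumed kernel acts on r^(2k)/(p...(p+2k-2))."""
    a, b = k + p / 2, q / 2
    # Eulerian integral of the first kind B(a, b)
    beta = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    front = 2 * math.gamma((p + q) / 2) / (math.gamma(p / 2) * math.gamma(q / 2))
    # integral_0^r s^(p-1+2k) (r^2-s^2)^(q/2-1) ds = (1/2) r^(p+q+2k-2) B(k+p/2, q/2)
    return front * 0.5 * beta / rising_product(p, k)

p, q = 2, 3
for k in range(6):
    print(k, transformed_coefficient(p, q, k), 1.0 / rising_product(p + q, k))
```

The two printed columns should coincide, for odd as well as even q.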
For even g the solution of (68) is given by 


d q/2, 
Send, 


(69) Gin =|55 


while for odd g the fractional integrations or differentiations cannot be elimi- 
nated, and the solution of (68) is given by 


d 
G,(r) = 
la] 


d (qt+1)/2 r2 
- Fal (r? — 


(69’) 


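To apply the above to the means $f_1(r)$, $f_2(r)$ for m≠n/2, suppose that
m<n/2, and put p=m, p+q=n−m, $g_p(r) = f_1(r)$, $g_{p+q}(r) = f_2(ir)$, (or else put
$g_p(r) = f_1(ir)$, $g_{p+q}(r) = f_2(r)$). There results

(70)    $f_2(ir) = \frac{2\Gamma[(n-m)/2]\, r^{2-n+m}}{\Gamma(m/2)\Gamma(n/2 - m)} \int_0^r s^{m-1} f_1(s) (r^2 - s^2)^{n/2-m-1}\, ds,$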


and, in particular, for even m and m=n/2—1, 


* See, for instance, the author’s paper on Heaviside’s operational calculus, American
Mathematical Monthly, vol. 43 (1936), pp. 332-334, 339.
† Another way of expressing the linear functional relation in question is by means of


1 


where F is the hypergeometric function, and the integration is carried out over a circle | s| =const. 
on and within which f, is analytic, and for |r| <|s|. This relation holds for any integer p, ?’; re p’. 




(71) = C 1) f 


The special case of (71) n=4 (and hence m=1), as well as the special case
of (64), n=4, m=2, has been noted by H. Bateman.*

Suppose next, that the function u, instead of being harmonic, is an analytic
solution of $D_1 u = D_2 u$, that is of

(72)    $\left( \frac{\partial^2}{\partial x_1^2} + \cdots + \frac{\partial^2}{\partial x_m^2} \right) u = \left( \frac{\partial^2}{\partial x_{m+1}^2} + \cdots + \frac{\partial^2}{\partial x_n^2} \right) u.$

Equation (63) is now replaced by

(73)    $D_1^k(u) = D_2^k(u),$

and the functional relations between $f_1(r)$, $f_2(r)$ become even simpler. Thus,
for even n and m=n/2, (64) is replaced by

(74)    $f_1(r) = f_2(r);$†

while in applying (67)-(69′), $g_p$, $g_{p+q}$ are replaced by $f_1(r)$, $f_2(r)$ directly, so
that for even q (69) yields


(75) mp) |= 


— m)/2] La(r?) 


The proof of the above results is based upon the analyticity of u and of
$f_1(r)$, $f_2(r)$; this is necessarily the case with harmonic functions and their
means; however, solutions of (72) need not be analytic in all the variables.
That the results obtained for solutions of (72) do apply, whether they are
analytic or not, follows from the fact that one may approximate to a function
and its first and second derivatives by means of an analytic function and its
derivatives.

Applying (74) to the one-dimensional‡ wave equation

(76)    $\frac{\partial^2 u}{\partial t^2} = c^2 \frac{\partial^2 u}{\partial x^2},$


one obtains 


* H. Bateman, Some geometric theorems, etc., American Journal of Mathematics, vol. 34 (1912), 
pp. 332-334. 

† This result is due to L. Asgeirsson, Mathematische Annalen, vol. 112 (1936).

‡ A one-dimensional “sphere” along the 1-space (x-axis) is the locus $(x - x_0)^2 = r^2$, where $x_0$ is the
center, r the radius; it consists of the two points $x_0 + r$, $x_0 - r$. The “spherical mean” of a function u(x)
over it is to be understood as $[u(x_0 + r) + u(x_0 - r)]/2$.






(77)    $\frac{u(x + ct', t) + u(x - ct', t)}{2} = \frac{u(x, t + t') + u(x, t - t')}{2}.$
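
Relation (77) is easily tested on a solution in d'Alembert's form F(x − ct) + G(x + ct). The following sketch assumes arbitrary illustrative choices for F, G and for the wave speed; none of these values come from the paper.

```python
import math

C = 2.0  # wave speed c (illustrative value)

def u(x, t):
    # F(x - ct) + G(x + ct) solves the one-dimensional wave equation
    return math.sin(x - C * t) + math.exp(-(x + C * t) ** 2)

def space_mean(x, t, tp):
    # left-hand member of (77): mean over the two points x + ct', x - ct' at time t
    return 0.5 * (u(x + C * tp, t) + u(x - C * tp, t))

def time_mean(x, t, tp):
    # right-hand member of (77): mean over the two instants t + t', t - t' at the point x
    return 0.5 * (u(x, t + tp) + u(x, t - tp))

print(space_mean(0.4, 1.1, 0.3), time_mean(0.4, 1.1, 0.3))
```

The two means agree to rounding error for any x, t, t′.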


For the three-dimensional wave equation

(78)    $\frac{\partial^2 u}{\partial t^2} = c^2 \left( \frac{\partial^2 u}{\partial x_1^2} + \frac{\partial^2 u}{\partial x_2^2} + \frac{\partial^2 u}{\partial x_3^2} \right)$

a relation similar to (71) yields

(79)    $A(u) \big|_t = \int_{-r/c}^{r/c} u_0(t + t')\, dt' / (2r/c),$

where the left-hand member is the mean of u at the time t over a sphere of
radius r, while the right-hand member is the time-average of u at the center
over the time interval t−r/c, t+r/c.*
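
A numerical illustration of (79) for a plane wave may be sketched as follows. It assumes the solution u = cos(K x_1 − ωt) with ω = cK, and uses the fact that on a sphere in three dimensions the coordinate x_1 is uniformly distributed over (−r, r). The constants and function names are illustrative; both integrals are evaluated by the midpoint rule.

```python
import math

C = 1.5          # wave speed c (illustrative)
K = 2.0          # wave number (illustrative)
OMEGA = C * K    # makes cos(K*x_1 - OMEGA*t) a solution of the wave equation

def sphere_mean(r, t, steps=20000):
    # mean of u over the sphere of radius r about the origin at time t;
    # x_1 = r*mu with mu uniform in [-1, 1] when n = 3
    h = 2.0 / steps
    s = 0.0
    for i in range(steps):
        mu = -1.0 + (i + 0.5) * h
        s += math.cos(K * r * mu - OMEGA * t)
    return s * h / 2.0

def time_average(r, t, steps=20000):
    # average of u at the center, u_0(t) = cos(OMEGA*t), over [t - r/c, t + r/c]
    half = r / C
    h = 2.0 * half / steps
    s = 0.0
    for i in range(steps):
        tp = -half + (i + 0.5) * h
        s += math.cos(-OMEGA * (t + tp))
    return s * h / (2.0 * half)

print(sphere_mean(0.8, 0.3), time_average(0.8, 0.3))
```

Both quantities reduce to cos(ωt) sin(Kr)/(Kr), which may serve as a further check.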

6. Inversion of the averaging process. The relations between means over
spheres in two mutually perpendicular directions considered in the preceding
section are also of interest in connection with the inverse problem of spherical
means. Of course, the operation A does not possess a unique inverse, since
many different functions can give rise to the same spherical mean. However,
if the function u is properly restricted, then the operation A may be inverted
uniquely. Thus we may ask: what even function $f_1(x_1)$ of the single variable $x_1$
will, when averaged over spheres with center at the origin, give rise to a given
function $f_2(r)$?

Suppose that $f_2$ is analytic in $r^2$ and $f_1$ in $x_1^2$. The following relations hold
at the origin (that is at the center of the spheres):


d**f, 


dx?* 0 


= Jo = A(V**fi)o = fio = 


Hence between the coefficients of the expansions of $f_1$ in powers of $x_1^2$ and
of $f_2$ in powers of $r^2$ exist the same ratios as between the coefficients of powers
of $r^2$ in (60), (61) for the case (73) provided m, n−m are replaced by 1 and n
respectively.

Similarly, if it is supposed that u is an even function of

(80)    $(x_1^2 + \cdots + x_m^2)^{1/2}, \quad m<n,$

the coefficients of the expansions of A(u) in powers of $r^2$ and of u in powers
of $x^2$ will have the same ratios as the coefficients of the powers of $r^2$ in (60),


* The time t−r/c corresponds to the moment at which spherical wavelets should start from the
points of the sphere, so that as they diverge with the characteristic velocity c, they will arrive at the
center at the time t; similarly for t+r/c and converging spherical wavelets.




(61) for the case (73) provided m, n−m are replaced by m, n. Carrying out
the necessary modifications in (66), there results

(81)    $A(u) = \frac{2\Gamma(n/2)}{\Gamma(m/2)\Gamma[(n-m)/2]}\, r^{2-n} \int_0^r s^{m-1} u(s) (r^2 - s^2)^{(n-m-2)/2}\, ds.$

The relation (81) may be established directly geometrically as follows:
Break up the spheres of integration S, implied in A by means of the loci
x=const., and let

(82)    $x = (x_1^2 + \cdots + x_m^2)^{1/2}, \quad y = (x_{m+1}^2 + \cdots + x_n^2)^{1/2},$

and

        $ds = (dx^2 + dy^2)^{1/2} = dx\, r/y,$

since $x^2 + y^2 = r^2$. The cylindrical shell between x and x+dx has a base lying
in $x_{m+1} = \cdots = x_n = 0$ of m-dimensional content

        $dB = K_m x^{m-1} dx,$

while upon each point of this base is “projected” an (n−m)-dimensional
subsphere of S, of radius y and of content

        $K_{n-m}\, y^{n-m-1}.$

The area of S intercepted by dB is therefore

        $dS = K_{n-m}\, y^{n-m-1}\, dB \sec\theta,$

where θ is the angle between the normal to S and the projecting lines, so
that sec θ = r/y. Hence

        $dS = K_m K_{n-m}\, x^{m-1} y^{n-m-2}\, r\, dx$

and

(81′)   $A(u) = \frac{K_m K_{n-m}}{K_n r^{n-2}} \int_0^r u(x)\, x^{m-1} (r^2 - x^2)^{(n-m-2)/2}\, dx.$

This is readily reduced to (81).
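
A short numerical check of (81) may be sketched as follows, assuming the form $A(u) = \frac{2\Gamma(n/2)}{\Gamma(m/2)\Gamma[(n-m)/2]} r^{2-n} \int_0^r s^{m-1} u(s)(r^2-s^2)^{(n-m-2)/2} ds$. The test function $u = s^2 = x_1^2 + \cdots + x_m^2$ is chosen because its mean over a sphere of radius r in n dimensions is $m r^2/n$ by symmetry. The function name and the midpoint rule are illustrative choices; the rule is used only for cases with n − m ≥ 2 even, where the integrand is smooth.

```python
import math

def mean_via_81(u, n, m, r, steps=20000):
    """Right-hand side of (81), as assumed above, by the midpoint rule."""
    front = 2 * math.gamma(n / 2) / (math.gamma(m / 2) * math.gamma((n - m) / 2))
    h = r / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += s ** (m - 1) * u(s) * (r * r - s * s) ** ((n - m - 2) / 2)
    return front * r ** (2 - n) * total * h

square = lambda s: s * s
print(mean_via_81(square, 3, 1, 2.0), 1 * 2.0 ** 2 / 3)   # n = 3, m = 1
print(mean_via_81(square, 4, 2, 1.5), 2 * 1.5 ** 2 / 4)   # n = 4, m = 2
print(mean_via_81(square, 5, 1, 1.0), 1 * 1.0 ** 2 / 5)   # n = 5, m = 1
```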

The uniqueness of the averaging process may now be based upon the 
known facts concerning the solution of (81). 

The results of this section will be applied in the paper On integral repre- 
sentations, etc., mentioned at the end of §1. 

7. Converse theorem. We start the proof of the converse theorem de- 
scribed in §1 by noting that if in 


fir) = polynomial of degree p in r? + O(r?) 


we replace $r^2$ by $x_1^2 + \cdots + x_n^2$, then $f_i(r)$, regarded as a space function in a




Euclidean n-dimensional space in which r is the distance from a point P, is
of class C‘*”) even at P. Applying (19) there follows 


f(r) = 


r=0 


where b, #0. Now, consider the square matrix m 
        $\| \nabla^{2j-2} f_i(r) \big|_{r=0} \|, \quad i, j = 1, 2, \cdots, p,$

where the Laplacians are obtained by forming space functions out of $f_i(r)$ in
the manner explained; suppose, for definiteness, that i represents the order
of the columns. Assume that not all the elements of the first row vanish.
Then, by a reversible linear transformation on $f_i(r)$ (consisting in permuting,
if necessary, two functions, then dividing the first function by its value at
r=0, and adding a constant multiple of it to the others) it is possible to
transform them into a new set of functions for which the corresponding
matrix has for the elements of the first row the numbers 1, 0, ···, 0.
Similarly, if in the new matrix not all the elements of the second row beyond
the first element vanish, it is possible to obtain p new functions related to the
former ones by a reversible linear transformation and such that their matrix
has for its first two rows the first two rows of the unit matrix: one needs only to
transform the functions beyond the first one in a fashion similar to the above
and then add to the first function a proper constant multiple of the second
one. This may be continued till either

(a) the matrix m has been transformed into the unit matrix, or else, 

(b) the matrix m has been transformed into a matrix whose first p′−1
rows, p′ ≤ p, are rows of the unit matrix, while in row p′ the principal diagonal
term and all the terms following it vanish.

Denote the new functions of r into which $f_i(r)$ have thus been transformed
by $F_i(r)$, and write in place of (21)

(83)    $A(u) \big|_{\text{center at } O} = D_1(O) F_1(r) + \cdots + D_p(O) F_p(r).$

In case (a) consider (83) for an arbitrary but fixed O. Interpreting each member
as a space function in the neighborhood of O, apply $\nabla^{2j}$ (j=0, 1, ···,
p−1) to both sides, and put r=0. Replacing $\nabla^{2j} A(u)|_{r=0}$ by $A(\nabla^{2j} u)|_{r=0}$,
hence by $\nabla^{2j} u|_O$, we obtain at O (hence everywhere inside R)

        $D_1(O) = u, \cdots, D_p(O) = \nabla^{2p-2} u.$





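Applying $\nabla^{2p}$ once more to (83), putting r=0, and utilizing the results
obtained, we get

(84)    $\nabla^{2p} A(u) \big|_{r=0} = \nabla^{2p} u = u\, \nabla^{2p} F_1(r) \big|_{r=0} + \cdots + \nabla^{2p-2} u\, \nabla^{2p} F_p(r) \big|_{r=0}.$

We have thus proved that u satisfies an equation of the form (13).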

In case (b) we obtain in a similar way from the first p′ applications of $\nabla^2$
a differential equation for u of the form (13) but of lower “order” p′. Applying
the results of §3 it follows that A(u) is linearly dependent on p′<p functions
of r. This case therefore reduces to case (a) with a value of p lower than
the value initially used.

As an example, suppose that for u of class C′′, A(u) is proportional to
the same function f(r) for any concentric spherical family irrespective of the
position of the center; f is supposed to be of class C′′ in r and f′(0)=0. Then
(84) yields

        $\nabla^2 u = [\nabla^2 F_1(r)]_{r=0}\, u, \quad F_1 = f(r)/f(0),$

or

        $\nabla^2 u = [n f''(0)/f(0)]\, u.$

In particular, if f′′(0)=0, u is harmonic.

We close this section by considering the interesting question as to whether
there exists a theorem similar to the one just proved but converse to those
results of §3 for which the averages A(u) are taken over spheres lying in a
spherical shell $R_{a,b}$ and enclosing the inner sphere, so that A(u) is linearly
dependent on functions $f_i(r)$, not all of which are regular at r=0. Thus, in the
plane, if u is harmonic in a ring $R_{a,b}$, the average over concentric circles of
radius r and enclosing the inner boundary r=a is given by A log r + B, where
A and B are constants whose value depends upon the position of the center.
Now suppose conversely, that u is of class C′′ in $R_{a,b}$ and that the average
for circles of above description is given by A log r + B; could one infer that
u is harmonic? That such need not be the case can be seen from the following
example.

Let $u = x_1 \log (x_1^2 + x_2^2)$, so that $\nabla^2 u = 4x_1/(x_1^2 + x_2^2)$, $\nabla^4 u = 0$; u is
biharmonic but not harmonic. It will be shown that in spite of u not being
harmonic, A(u) is of the form A + B log r for any family of circles enclosing
the origin.

Since $\nabla^4 u = 0$, $A(u) = AV_1(r) + BV_2(r) + C + Dr^2$, where A, B, C, D are
constants depending upon the position of the center. Holding the latter fixed,
applying $\nabla^2$ to A(u), and replacing $\nabla^2 A(u)$ by $A(\nabla^2 u)$, there results

        $A(\nabla^2 u) = BV_1(r) + 4D = B \log r + 4D.$

Letting r become infinite, it follows that B=D=0, since $\nabla^2 u$ vanishes at
infinity.
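
The example can be confirmed numerically. The sketch below averages $u = x_1 \log (x_1^2 + x_2^2)$ over circles of radius r about a center (a, b), with r large enough to enclose the origin, and compares the result with a + 2a log r. This closed form is obtained here by expanding log |z| on the circle and is offered as a check rather than as a formula from the paper; the function names and sample values are illustrative.

```python
import math

def u(x1, x2):
    return x1 * math.log(x1 * x1 + x2 * x2)

def circle_mean(a, b, r, steps=4000):
    # equally spaced points on the circle; very accurate for periodic integrands
    total = 0.0
    for i in range(steps):
        th = 2 * math.pi * (i + 0.5) / steps
        total += u(a + r * math.cos(th), b + r * math.sin(th))
    return total / steps

a, b = 0.3, 0.4          # center at distance 0.5 from the origin
for r in (1.0, 2.0, 5.0):  # all circles enclose the origin
    print(r, circle_mean(a, b, r), a + 2 * a * math.log(r))
```

The mean is thus of the form A + B log r with constants depending on the center, although u is not harmonic.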

8. A generalization of (17). In this section is established the following
generalization of the law of the mean (17):

(85)    $A(uP_k) = \sum_{i=0}^{[k/2]} \frac{2^{k-2i}}{i!} \left[ \frac{\partial^{k-i} \phi_n(r, \lambda)}{\partial \lambda^{k-i}} \right]_{\lambda = \nabla^2} (\nabla^{2i} P_k)\!\left( \frac{\partial}{\partial x_1}, \cdots, \frac{\partial}{\partial x_n} \right) u_0 .$

Here u is analytic at the origin, which point will be supposed to be the center
of the spherical family implied in the averaging operation A; $P_k$ is a
homogeneous polynomial of degree k; the indicated term on the right-hand side is
obtained by replacing λ by $\nabla^2$ in the λ-power expansion of $\partial^{k-i}\phi/\partial\lambda^{k-i}$,
multiplying the resulting series termwise by $\nabla^{2i} P_k$ in which $x_1, x_2, \cdots$ have been
replaced by $\partial/\partial x_1, \partial/\partial x_2, \cdots$, multiplying the result by $u_0$, then interpreting
each term by applying the indicated differentiation to u at the origin; [x],
as usual, denotes “the greatest integer in x.”
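For the case

        $P_k = H_k,$

where $H_k$ is harmonic, the above reduces to

(86)    $A(uH_k) = 2^k \left[ \frac{\partial^k \phi_n(r, \lambda)}{\partial \lambda^k} \right]_{\lambda = \nabla^2} H_k\!\left( \frac{\partial}{\partial x_1}, \cdots, \frac{\partial}{\partial x_n} \right) u_0,$

and introducing Bessel functions by means of (10),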


A(uH,) =T (<) (A1/2) (n/2-1) 
2 
(87) 


A=V 
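To prove the above, consider as in §4, the special case

        $u = \exp [c_1 x_1 + \cdots + c_n x_n],$

putting again

        $\lambda = c_1^2 + \cdots + c_n^2.$

Since

        $\nabla^2 (P_k \exp [c_1 x_1 + \cdots + c_n x_n]) = \exp [c_1 x_1 + \cdots + c_n x_n] \left[ \nabla^2 P_k + 2 \left( c_1 \frac{\partial}{\partial x_1} + \cdots + c_n \frac{\partial}{\partial x_n} \right) P_k + \lambda P_k \right],$

it follows that

        $(\nabla^2 - \lambda)(P_k \exp [c_1 x_1 + \cdots]) = \exp [c_1 x_1 + \cdots] \left[ \nabla^2 + 2 \left( c_1 \frac{\partial}{\partial x_1} + \cdots \right) \right] P_k$

and hence by induction

        $(\nabla^2 - \lambda)^i (P_k \exp [c_1 x_1 + \cdots]) = \exp [c_1 x_1 + \cdots] \left[ \nabla^2 + 2 \left( c_1 \frac{\partial}{\partial x_1} + \cdots \right) \right]^i P_k.$

In particular

        $(\nabla^2 - \lambda)^{k+1} (P_k \exp [c_1 x_1 + \cdots]) = 0.$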
Hence, the mean of $P_k \exp [c_1 x_1 + \cdots]$ may be evaluated by means of (9).
The last term of the right-hand sum of (9) reduces to 
k! o 
since $P_k$ is a homogeneous polynomial of degree k. A similar reduction in the
other terms results in
(Si 


where [x] denotes the greatest integer not exceeding x, and $(\nabla^{2i} P_k)(c_1, \cdots, c_n)$
is the result of replacing the x’s by the c’s in the (k−2i)th degree polynomial
$\nabla^{2i} P_k$.

Having thus established (85) for an exponential function, the case of a 
general analytic function is now deduced from the above, as in (24), by means 
of the formal representation of the Taylor series by means of an exponential. 

The result (86) (for n=3) is due essentially to Hobson.* It will be utilized
in the following section in connection with obtaining analogues of the results
thus far obtained for means and the functional operator A, to other functional
operators considered in I.

An interesting application of (85), (86) consists in examining the order
of vanishing with r of the means $A(uH_k)$, $A(uP_k)$. It is found that

(88)    $A(uH_k) = O(r^{2k}),$
        $A(uP_k) = O[r^{(k,\, k+1)}]$ according as k is (even, odd).


More precisely, 


* Hobson, Proceedings of the London Mathematical Society, vol. 24 (1892-1893), p. 80. 




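(89)    $\lim_{r \to 0} \frac{A(uH_k)}{r^{2k}} = \frac{H_k(\partial/\partial x_1, \cdots, \partial/\partial x_n)\, u_0}{n(n+2) \cdots (n+2k-2)},$

(90)    $\lim_{r \to 0} \frac{A(uP_k)}{r^k} = \frac{u_0 (\nabla^2)^{k/2} P_k}{2 \cdot 4 \cdots k \cdot n(n+2) \cdots (n+k-2)}$ for even k,

        $\lim_{r \to 0} \frac{A(uP_k)}{r^{k+1}} = \frac{(\nabla^{k-1} P_k)(\partial/\partial x_1, \cdots, \partial/\partial x_n)\, u_0}{2 \cdot 4 \cdots (k-1) \cdot n(n+2) \cdots (n+k-1)}$ for odd k.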


9. Extension to other operators. Let L be any linear functional operator
which is permutable with $\nabla^2$. If u is a solution of (1), then L(u) likewise
satisfies (1). This follows (as in §3 for the case L=A) by applying L to both
sides of (1) and permuting L with $\nabla^2$:

        $L(\nabla^2 u) - \lambda L(u) = 0 = \nabla^2 (L(u)) - \lambda L(u).$

Similarly, it is shown that L(u) satisfies (8), (11), or (13) if u does and if L
is permutable with $\nabla^2, \nabla^4, \cdots$. In many cases this leads to results as
definite as the laws of the mean proved above (properly speaking, however, the
results are not covered by the title of this paper).

Consider first the operator $L_k$ (see I, (5)):

(91)    $L_k(u) = h_k(\omega) \int h_k'(\omega')\, u(r, \omega')\, d\omega' = h_k(\omega) I(r).$

Since (see I, (7))

        $\nabla^2 [h_k(\omega) I(r)] = h_k(\omega) \left[ \frac{d^2}{dr^2} + \frac{n-1}{r} \frac{d}{dr} + \frac{k(2-n-k)}{r^2} \right] I(r),$

it follows that if u satisfies (1), I(r) will satisfy the ordinary differential
equation obtained from (1) by replacing $\nabla^2$ by

(92)    $\frac{d^2}{dr^2} + \frac{n-1}{r} \frac{d}{dr} + \frac{k(2-n-k)}{r^2};$

similarly for solutions of (8), (11), (13). Thus in case of solutions of (8), I(r)
satisfies

(93)    $\left[ \frac{d^2}{dr^2} + \frac{n-1}{r} \frac{d}{dr} + \frac{k(2-k-n)}{r^2} - \lambda \right]^p I(r) = 0.$

Proceeding in a manner similar to that employed in §2 for the case k=0,
the following alternative forms for solutions of (93) may be obtained



W), i(r, 
(04) d) i=0,---,p—1. 


Here $B_s(x)$ denotes a Bessel function of order s and argument x. All the three
forms (94) apply for λ≠0; the two latter also for λ=0. For λ=0 the above
yield the p-harmonic polynomials

(95)    $H_k,\; H_k r^2,\; \cdots,\; H_k r^{2p-2},$*

while from the last form (94) are obtained the p-harmonic functions
(96) = 0,---, p — 1, 
where 

ln r if the power of r is even and non-negative,
(97) 

1 otherwise. 


The general case (13) can now be handled by factorization (see (15), (15′)).

The operation $L_k$, or more explicitly, its component

        $I(r) = \int h_k'(\omega')\, u(r, \omega')\, d\omega',$

differs only by a factor $K_n/r^k$ from the operation $A(uH_k')$, $H_k' = h_k' r^k$,
considered in the preceding section.

Combining the results indicated in this section with (86) one obtains for 
analytic solutions of (13) 


I(r)n(n + 2)--- (n+ 2k — 2) 
K,r* 
Ox OXn 
(98) 0 0 = 0 
Ox, OXn 
0 0 
Ox, OXn 


* It is known that such products of harmonic polynomials by powers of r? suffice to yield a 
complete set of p-harmonic polynomials of any degree. See Almansi, Sull’integrazione, etc., Annali 
di Matematica, (3), vol. 2 (1899), §3. 




provided that the factorization of the left-hand member of (13) has no re- 
peated roots. For k=0 this reduces to (14). 

Turning to other operators discussed in I, consider the operators A, (see I, 
§7), which generalize L, to non-integer k: 


(99) Ax(u) = f hi (’)u(r, 


here $h_k$, $h_k'$ are solutions of I, (12):

(100)   $\Delta_2 h = -k(k+n-2)h$

along (Riemann surfaces spread over) the unit sphere. It is found that for
solutions of (8) the differential equation (93) still applies to I(r), but of
course, with the proper value of k, leading to Bessel function solutions as
displayed in the first form (94), but of non-integer order.

Consider next the operators L;.,,-: (see I, §6): 


+00 
where /,, h¢ are functions of x, x2, - - - , X,-1 Satisfying the differential equa- 
tion : 
Ox? Ox,2_1 


When $L_{k,n-1}$ are applied under the conditions of I, Theorem IV to solutions
of (13), it is found that $I(x_n)$ satisfies the equation obtained from (13) by
replacing $\nabla^2$ by

(103)   $d^2/dx_n^2 + k.$


Consider finally non-Euclidean spaces $N_n$ and operators permutable with
$\Delta_2$, the second invariant differential operator of Beltrami. To them belong
the operators $L_k$, $L_k^*$, $L_k^{**}$ of I, §8, applied respectively over spheres,
horospheres, and equidistant surfaces, the two latter in the Lobatchevsky space.
Here $L_k$ is given by (91), where r, ω are spherical coordinates, and $h_k$, $h_k'$
satisfy the equation (100) over a Euclidean unit sphere. Applied to a solution
of

(104)   $\Delta_2 u - \lambda u = 0,$

the operator $L_k$ leads to a function I satisfying the equation


dl (/rk(k+n—2) 
(105) 1) cot or ( ) 
dr? dr 


sin? cr c? 




The operator L;* is applied over equidistant horospheres ¢=const. with the 
element of length 


(106) ds? = e?€(dx? +--+ + dx 21) + dé; 


it is given by (101) with x, replaced by é, where h, h’ satisfy (100). Now, a 
solution of (104) leads to J satisfying 


/ — n — — = 
dé dé 


The operator L,;** leads to (105) but with rc replaced by 7/2—r/c. 
Let us map the non-Euclidean space NV, on a Euclidean sphere in E,,,, of 
radius R=1/c and define a function v in EZ,,,; thus: 


(108) v= RM. 
If u satisfies (104), then 


ROR R? 


= [9(g + 2 — 1) + d]o/R?. 
Hence if g is chosen as a root of 
(109) 


then v is harmonic in $E_{n+1}$. This relation will be utilized in the paper On
integral representations, etc., cited in §1.

In concluding we recall briefly the extensions of some of the above results
to certain discontinuous functions. Consider, for instance, for n=3 the
function $1/r_1$, where $r_1$ is the distance from a fixed point $P_1$. The operations A
or $L_k$ applied over spheres with center at P yield harmonic results to each
side of the sphere $S_1$ passing through $P_1$. From the interpretation of $L_k(u)$
in terms of the distribution of the unit mass at $P_1$ over $S_1$ (see I, §9) follows
that I(r) is continuous at $r = PP_1$, but that its derivative is discontinuous
there by the amount [$h_k'(\omega_1)$, the value of $h_k'$ at $P_1$]/4π. A convenient way
of proving this and other similar results is by spreading out the concentrated
(point) mass over a finite region, thus representing $1/r_1$ as the limit of a
function with a continuous Laplacian.


GENERAL ELECTRIC COMPANY,
SCHENECTADY, N. Y.


THE SCHÖNEMANN-EISENSTEIN IRREDUCIBILITY
CRITERIA IN TERMS OF PRIME IDEALS*


BY 
SAUNDERS MacLANE 


1. Introduction. The Eisenstein criterion for the irreducibility of a
polynomial† has been repeatedly generalized, in many cases by the use of Newton
polygons. All of these irreducibility criteria for polynomials can be
systematically viewed in terms of non-archimedean absolute values, so that
we can state a general theorem which includes all these theorems as special
cases and which also establishes the irreducibility of new classes of
polynomials. Our general theorem asserts, in effect, that a polynomial G(x) with no
multiple roots and with rational coefficients is irreducible if there is a rational
prime p which has just one prime ideal factor in the ring R[x]/G(x), which is
obtained by reducing modulo G(x) the ring of all polynomials with rational
coefficients. This criterion can be constructively applied by using a previously
developed method for actually exhibiting the prime decomposition of any p.‡

The known irreducibility criteria are simply conditions which imply that
the first few stages of the prime ideal construction will show that p has but
one prime ideal factor. The Schönemann criterion asserts the irreducibility
of polynomials of the form

(1)     $f(x) = \phi(x)^e + pM(x),$

where φ(x) is an irreducible polynomial modulo p and where M(x) is a
polynomial relatively prime to φ, mod p, and of degree less than the degree of f.
Alternatively, these conditions show that in the ring R[x]/f(x), p has just one
prime ideal factor P=(p, φ(x)), and that this factor may be found by the
“second stage” of the construction of the factors of p. Because there is but
one prime factor,§ and because the degree of P times the exponent to which
P divides p is the degree of f(x), f(x) must be irreducible.
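
As an illustration, the classical Eisenstein criterion (the case φ(x) = x of (1)) can be tested mechanically for a polynomial with integer coefficients. The sketch below is a minimal check of the usual textbook conditions: the leading coefficient is not divisible by p, every other coefficient is divisible by p, and the constant term is not divisible by $p^2$. The function name and the convention of listing coefficients from the leading term down are illustrative choices, not notation from this paper.

```python
def is_eisenstein(coeffs, p):
    """coeffs = [a_n, ..., a_1, a_0], integers; True if the Eisenstein conditions hold at p."""
    lead, rest = coeffs[0], coeffs[1:]
    if not rest or lead % p == 0:
        return False          # constant polynomial, or p divides the leading coefficient
    if any(a % p != 0 for a in rest):
        return False          # some lower coefficient is not divisible by p
    return rest[-1] % (p * p) != 0   # p^2 must not divide the constant term

print(is_eisenstein([1, 2, 2], 2))            # x^2 + 2x + 2
print(is_eisenstein([1, 4, 4], 2))            # x^2 + 4x + 4 = (x + 2)^2
print(is_eisenstein([1, 0, 0, 10, 5], 5))     # x^4 + 10x + 5
```

A result of True certifies irreducibility over the rationals; False decides nothing, since the criterion is only sufficient.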


* Presented to the Society, January 2, 1936; received by the editors February 23, 1937. 

† For a simple statement see B. L. van der Waerden, Moderne Algebra, §22.

‡ S. MacLane, A construction for absolute values in polynomial rings, these Transactions, vol. 40
(1936), pp. 363-395; S. MacLane, A construction for prime ideals as absolute values of an algebraic field,
Duke Mathematical Journal, vol. 2 (1936), pp. 492-510. We refer to these two papers as Const I
and Const II, respectively. They contain the definition of absolute values, etc., used subsequently.

§ The connection between the Eisenstein irreducibility criterion and the prime ideal factorization
of a rational prime was observed by M. Bauer, Zur allgemeinen Theorie der algebraischen Grössen,
Journal für die Mathematik, vol. 132 (1907), pp. 21-32, especially §IV; also by O. Perron, Idealtheorie
und Irreduzibilität von Gleichungen, Mathematische Annalen, vol. 60 (1905), pp. 448-458.




SCHONEMANN-EISENSTEIN CRITERIA 227 


Our new irreducibility criterion may be stated with reference to a rational
prime p or, alternatively, in terms of the corresponding “p-adic” absolute
value. This simple form of the theorem is stated in §2 for a polynomial with
coefficients in any field K. It involves certain absolute values of the
polynomial ring K[x]. We include also a more general theorem giving all possible
degrees for the factors of a reducible polynomial.* Next, in §3, we indicate
how our result includes both old and new cases. To establish the prime ideal
interpretation, we first develop briefly in §4 the properties of prime ideals in
a ring K[x]/G(x), where K is an algebraic number field. These properties
give the irreducibility theorem in the prime ideal form. Finally in §5 we show
how the successive “approximant” values used in our irreducibility criteria
do, in fact, give a construction for the prime ideals in the corresponding ring
K[x]/G(x). Hence the general irreducibility theorem, stated in terms of
absolute values, implies the form of the irreducibility theorem already stated
in terms of prime ideals.

The fundamental irreducibility theorem of §2 can also be applied to poly- 
nomials in several variables. In the last section we give several specific ex- 
amples of the new irreducibility criteria which result. 

2. Irreducibility criteria with approximants. An absolute value of a ring is
a function V(a) defined for all a in the ring and with the properties

        $V(ab) = V(a) + V(b), \quad V(a+b) \geq \min (Va, Vb).$

An element a of the ring is equivalence-divisible in V by an element b if there
is an element c with V(a−bc) > V(a) = V(bc).
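
A familiar instance of such a value is the p-adic value of a rational number, namely the exponent to which the prime p divides it. The following minimal sketch computes it for nonzero rationals and lets the two defining properties be checked on examples; the function name `vp` is illustrative, and the value of 0 (conventionally +∞) is excluded.

```python
from fractions import Fraction

def vp(a, p):
    """p-adic value of a nonzero rational a: the exponent of p in a."""
    a = Fraction(a)
    if a == 0:
        raise ValueError("the value of 0 is taken to be +infinity")
    num, den, v = a.numerator, a.denominator, 0
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

a, b = Fraction(18, 5), Fraction(4, 15)
print(vp(a * b, 3), vp(a, 3) + vp(b, 3))        # V(ab) = V(a) + V(b)
print(vp(a + b, 3), min(vp(a, 3), vp(b, 3)))    # V(a+b) >= min(Va, Vb)
```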

Consider now polynomials with coefficients in any field K. A polynomial
f(x) is a key polynomial over a value V of K[x] if f(x) has the first coefficient 1,
if any polynomial equivalence-divisible by f(x) in V has a degree at least as
great as the degree of f(x), and if any product equivalence-divisible by f(x)
in V has a factor equivalence-divisible by f(x) in V.† The first form of our
general irreducibility criterion for polynomials is


THEOREM 1. If K is any field and if G(x) is a key polynomial over a value V
of the polynomial ring K[x], then G(x) is irreducible.


Proof. Suppose that G(x) could be factored as G(x) = f(x)h(x). Then this
product is equivalence-divisible by G in V. As G is equivalence-irreducible,
by the definition of a key, one of the factors f or h must be
equivalence-divisible by G. But G is also minimal, so that this factor has a degree at
least that of G. Therefore the assumed decomposition is trivial, so G is
irreducible.


* Simple theorems of this type have been stated by Dumas and Ore (cf. §3 below) and by
O. Perron, loc. cit., and E. Netto, Ueber die Irreductibilität ganzzahliger ganzer Funktionen,
Mathematische Annalen, vol. 48 (1896), pp. 82-88.

† MacLane, Const I, Definition 4.1.

The relevance of this Theorem derives from the possibility of explicitly
constructing all possible values of K[x] for many fields K and from an
explicit criterion* which determines when G(x) is a key polynomial over such
values. Any value V in K[x] determines a value $V_0 a = Va$ in the coefficient
field K. The simplest values of K[x] are the “inductive values”† $V_k$
characterized by the properties: (i) $V_k$ agrees in the field K with a given value $V_0$;
(ii) $V_k$ gives certain polynomials $\phi_1(x) = x, \phi_2(x), \cdots, \phi_k(x)$ specific assigned
values $V_k \phi_i(x) = \mu_i$; (iii) $V_k$ assigns to every polynomial in the ring K[x] the
smallest possible value consistent with the conditions (i) and (ii). Such an
inductive value is denoted by

(2)     $V_k = [V_0,\; V_1 x = \mu_1,\; V_2 \phi_2 = \mu_2,\; \cdots,\; V_k \phi_k = \mu_k].$


These values may be obtained by an inductive definition of the values $V_i$ of
K[x] determined by the first i polynomials $\phi_i(x)$. In (2), each $\mu_i$ is a number,
while each polynomial $\phi_i(x)$, with i>1, must be a key polynomial over the
previous inductive value $V_{i-1}$, and must satisfy two other minor conditions
(Const I, Definition 6.1). The set of all numbers $\nu = V_k f(x) - V_k g(x)$ which
are values of rational functions f(x)/g(x) in $V_k$ is a group, the value-group $\Gamma_k$.
For a given $\phi_k(x)$ there is for each polynomial G(x) a unique “expansion” in
powers of $\phi_k$, of the form

(3)     $G(x) = g_m(x) \phi_k^m + g_{m-1}(x) \phi_k^{m-1} + \cdots + g_0(x),$

where the coefficients $g_j(x)$ in the expansion are 0 or of degree less than the
degree of $\phi_k$. The value $V_k G$ is the minimum of the values of the terms in this
expansion.
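
The expansion (3) is obtained by repeated division by the key polynomial. The following sketch assumes rational coefficients and a monic divisor of degree at least 1; polynomials are represented as coefficient lists with the lowest degree first. The function names are illustrative and are not the notation of Const I.

```python
from fractions import Fraction

def divmod_poly(f, phi):
    """Divide f by monic phi; coefficient lists, lowest degree first."""
    f = [Fraction(c) for c in f]
    d = len(phi) - 1                      # degree of phi, assumed >= 1
    q = [Fraction(0)] * max(len(f) - d, 1)
    for i in range(len(f) - 1, d - 1, -1):
        c = f[i]                          # current leading coefficient (phi is monic)
        q[i - d] = c
        for j in range(d + 1):
            f[i - d + j] -= c * phi[j]
    return q, f[:d]                       # quotient and remainder

def phi_expansion(G, phi):
    """Coefficients g_0, g_1, ..., g_m with G = sum g_j * phi^j and deg g_j < deg phi."""
    coeffs = []
    while any(c != 0 for c in G):
        G, rem = divmod_poly(G, phi)
        coeffs.append(rem)
    return coeffs

# G = (x^2+1)^2 + 2x(x^2+1) + 3 = x^4 + 2x^3 + 2x^2 + 2x + 4, phi = x^2 + 1
print(phi_expansion([4, 2, 2, 2, 1], [1, 0, 1]))
```

For this example the expansion has $g_0 = 3$, $g_1 = 2x$, $g_2 = 1$, returned as the coefficient lists of the three remainders.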

The hypothesis in Theorem 1 that G is a key polynomial is most easily
fulfilled by making a suitable multiple of G have a residue class modulo V
(Const I, part II) which is a linear polynomial.


THEOREM 2. If $V_k$ is an inductive value with a last key polynomial $\phi_k$, and
if a polynomial G(x) has an expansion (3) in terms of $\phi_k$, such that (i) $g_m(x) = 1$;
(ii) $V_k G = V_k \phi_k^m = V_k g_0$; (iii) if n<m is a positive integer, then $n\mu_k$ is not in
the value-group $\Gamma_{k-1}$; then G(x) is a key polynomial over $V_k$, and hence
irreducible.


* The method of Const I, Theorem 9.4 and §13 applies whenever V is an inductive value and
K an algebraic field.
† For the explicit definition, see MacLane, Const I, §4, (3) or Const II, §2, (4).






Proof. Conditions (i) and (ii) will make G(x) a key, by Const I, Theorem
9.4, provided we also show G(x) equivalence-irreducible in $V_k$. But any G
has by Const II, Theorem 4.2 a representation as a product of key
polynomials ψ(x). Each one must have the form of an expansion

        $\psi(x) = \phi_k^e + \cdots + h_0(x), \quad e>0,$

in which the first and last terms again have the same value in $V_k$. By the
minimal property (iii) of m this is possible only if e is a multiple of m. Hence
the representation of G has just one factor ψ. This factor has the same
degree and the same equivalence-divisors as G, so that G, like ψ, is
equivalence-irreducible and therefore a key polynomial.* These theorems include and
generalize all the classical irreducibility criteria of the Newton polygon type.

To obtain information about the degree of possible factors of a reducible
polynomial, we use “approximants.” If $V_k$ is a finite and homogeneous† kth
stage value of K[x], and if G(x) is any polynomial expanded as in (3), consider
the exponents j in (3) with the property that $V_kG = V_k(g_j(x)\phi_k^j)$, and
denote by $\alpha$ the largest and by $\beta$ the smallest of these exponents. Then the
projection of G on $V_k$ is taken to be

(4)    $\operatorname{proj}(V_k, G) = \alpha - \beta.$

The homogeneous value $V_k$, considered as an extension of the value $V_0$ of K,
is an approximant of G over $V_0$ if and only if $\operatorname{proj}(V_k, G) > 0$. For any polynomial
G(x) with coefficients in an algebraic number field, the set of all kth
approximants can be found by Newton polygons with the procedure used in
Const II for an irreducible polynomial G(x).
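The definition (4) can be made concrete in the simplest setting. The sketch below is only an illustration under stated assumptions: it takes a first stage value of the form [V_0 p = 1, V_1 x = mu] on polynomials with integer coefficients, so that the key polynomial is x itself and the expansion coefficients are the integer coefficients. The names `p_adic_value` and `projection` are hypothetical helpers, not notation from the paper.

```python
from fractions import Fraction


def p_adic_value(a: int, p: int) -> int:
    """Exponent of the prime p in the nonzero integer a (assumed V_0 p = 1)."""
    if a == 0:
        raise ValueError("the value of 0 is infinite")
    v = 0
    while a % p == 0:
        a //= p
        v += 1
    return v


def projection(coeffs, p, mu):
    """Return (alpha, beta, alpha - beta) for a first stage value V_1 x = mu.

    coeffs[j] is the integer coefficient of x**j. The term a_j x**j is
    assigned the value V_0(a_j) + j*mu; alpha and beta are the largest and
    smallest exponents whose terms attain the minimum value.
    """
    values = {j: p_adic_value(a, p) + j * mu
              for j, a in enumerate(coeffs) if a != 0}
    least = min(values.values())
    minimal_exponents = [j for j, v in values.items() if v == least]
    alpha, beta = max(minimal_exponents), min(minimal_exponents)
    return alpha, beta, alpha - beta


# G(x) = x^3 - 2 at p = 2: with mu = 1/3 the constant and leading terms
# both have value 1, so the projection is 3 - 0 = 3.
print(projection([-2, 0, 0, 1], 2, Fraction(1, 3)))
# With mu = 1 only the constant term is minimal, so the projection is 0.
print(projection([-2, 0, 0, 1], 2, Fraction(1)))
```

In this toy setting a positive projection corresponds to a side of the Newton polygon of slope determined by mu; higher stage values would need the expansions in powers of later key polynomials, which the sketch does not attempt.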

A homogeneous inductive value $V_s$ is non-finite if $V_s\phi_s = \mu_s = \infty$. We call
such a $V_s$ an improper kth approximant to G(x), for any $k \ge s$, if $\phi_s$ is a factor
of G(x). We define $\operatorname{proj}(V_s, G)$ as the exponent to which $\phi_s$ divides G. Henceforth
the phrase “kth approximants” refers both to proper and improper kth
approximants.

To interpret (4), note that the integer $\beta$ can be uniquely characterized by
the properties

(4a)    $\phi_k^{\beta}$ is an equivalence-divisor of G(x) in $V_k$;
        $\phi_k^{\beta+1}$ is not an equivalence-divisor of G(x) in $V_k$.


* This could also have been proven by the method of Const I, by finding an R(x) so that R-G 
has value 0, showing (Theorem 12.1) that R-G has a residue-class which is a linear polynomial and 
thus proving G equivalence-irreducible by Lemma 11.2. 

† Homogeneity is required because every inductive value with a discrete $V_0$ is equal to one and
only one homogeneous value (cf. Const I, p. 393). 


230 SAUNDERS MacLANE [March 


For all terms to the right of $g_\beta(x)\phi_k^{\beta}$ in the expansion (3) certainly have larger
values than this term, hence the first half of (4a). If in addition the second
half of (4a) were false, there would be an h(x) with $f(x) \sim h(x)\phi_k^{\beta+1}$ in $V_k$;
that is, with

$V_k[f(x) - h(x)\phi_k^{\beta+1}] > V_kf(x).$

By inserting in $f = h\phi_k^{\beta+1} + [f - h\phi_k^{\beta+1}]$ the expansions in powers of $\phi_k$ for h(x)
and for the term in brackets we get an expansion for f(x) in which the last
term with the minimal value has the exponent $\beta + 1$ or greater, counter to
the definition of $\beta$. Therefore (4a) characterizes* $\beta$.

For a product $G = f'(x)f''(x)$, each factor $f'$, or $f''$, has exponents $\alpha'$ and $\beta'$,
or $\alpha''$ and $\beta''$, as in (4). By (4a), $\beta'$ is the power to which $\phi_k$ is an equivalence-divisor
of $f'$, and similarly for $\beta''$. Therefore, by the uniqueness of the equivalence
decomposition, $\beta = \beta' + \beta''$ (see Const II, §4). Furthermore $\alpha$ is the
“effective degree” of G(x) and has the property $\alpha = \alpha' + \alpha''$ (see Const II,
§4, (1)). Combining these, we have

(5)    $\operatorname{proj}(V_k, f'(x)f''(x)) = \operatorname{proj}(V_k, f'(x)) + \operatorname{proj}(V_k, f''(x)).$

This also holds for improper approximants. We obtain at once


Lemma 1. The kth approximants of a product G(x)H(x) over Vo consist of 
all the kth approximants of G(x) and of all the kth approximants of H(x) over Vo. 


THEOREM 3. The degree-of-factors theorem. If for a value $V_0$ of the field K
all the kth stage homogeneous approximants to the polynomial G(x) over $V_0$ are
denoted by $V_k^{(i)}$, $i = 1, \cdots, t$, then any factor f(x) of G(x) has a degree†

(6)    $\deg f(x) = \sum_{i=1}^{t} c_i \deg \phi(V_k^{(i)}),$

where each $c_i$ is a number satisfying $0 \le c_i \le \operatorname{proj}(V_k^{(i)}, G)$.

Proof. This will follow at once by (5) combined with the equation

(7)    $\deg G(x) = \sum_{i=1}^{t} \operatorname{proj}(V_k^{(i)}, G) \cdot \deg \phi(V_k^{(i)})$


for any polynomial G(x). To establish (7), decompose G(x) into its irreducible 


* A longer proof of this fact is also given in Const II, Theorem 5.1. 
† Here $\deg \phi(V_k^{(i)})$ represents the degree of the last key polynomial of $V_k^{(i)}$. It is equal to the
degree of the residue-class field of $V_k^{(i)}$, multiplied by the exponent of $V_k^{(i)}$. See Const II, §9, (1).


factors, apply Const II, Theorems 5.2 and 5.3* to obtain equations similar 
to (7) for each such irreducible factor, and combine these equations by using 
the relation (5). 
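The degree-of-factors theorem restricts factor degrees to a small set of sums. As a rough sketch (the function name and the input format are assumptions made here for illustration), the admissible degrees can be enumerated from a list of pairs (deg φ of an approximant, its projection on G), taking each multiplier c_i to be an integer between 0 and the projection:

```python
def possible_factor_degrees(approximants):
    """Enumerate the sums of c_i * deg_phi_i with 0 <= c_i <= proj_i.

    approximants is a list of (deg_phi, proj) pairs, one per kth stage
    approximant; the pairs are hypothetical input data, to be supplied
    by a separate Newton polygon computation.
    """
    sums = {0}
    for deg_phi, proj in approximants:
        sums = {s + c * deg_phi for s in sums for c in range(proj + 1)}
    return sorted(sums)


# Two approximants with key polynomials of degrees 2 and 3, each of
# projection 1: G has degree 5 and a proper factor can only have
# degree 2 or 3.
print(possible_factor_degrees([(2, 1), (3, 1)]))  # [0, 2, 3, 5]
```

The largest value in the list is the degree of G itself, in agreement with the degree equation used in the proof.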

If G(x) has no multiple factors and $V_0$ is “discrete,” then k can be taken
so large that every kth approximant has† the projection 1. A consequence is:

THEOREM 4. If for a value $V_0$ of K and an integer k there is only one kth
stage approximant $V_k$ to G(x), and if $\operatorname{proj}(V_k, G) = 1$, then G(x) is irreducible.

If $V_k$ be replaced by $V_{k+1}$, this is essentially a restatement of Theorem 1
with the additional advantage that the value V introduced arbitrarily there
is now characterized as an approximant of G(x).

Irreducibility can also be established by several applications of the degree- 
of-factors theorem. 


THEOREM 5. If $V_0$ and $W_0$ are two given values of a field K, if G(x) is of
degree $n = n_1n_2$, where $(n_1, n_2) = 1$, and if there is an integer k such that every
kth approximant $V_k$ to G(x) over $V_0$ has $\deg \phi(V_k) \equiv 0 \pmod{n_2}$, while every
kth approximant $W_k$ to G(x) over $W_0$ has $\deg \phi(W_k) \equiv 0 \pmod{n_1}$, then G(x)
is irreducible.

Proof. By the condition on $V_k$ and Theorem 3 each factor f(x) has a degree
which is a sum of multiples of $\deg \phi(V_k)$ and which is therefore a multiple
of $n_2$. By the same argument for $W_k$ the degree of f(x) is a multiple of $n_1$,
hence of $n_1n_2 = n$ because $(n_1, n_2) = 1$,
and so this degree must be n, the degree of G(x).
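The arithmetic at the end of this proof is easy to check mechanically. The following sketch (a hypothetical helper, not part of the paper) lists the degrees between 1 and n that are multiples of both of the coprime numbers n_1 and n_2:

```python
from math import gcd


def degrees_allowed_by_both(n, n1, n2):
    """Degrees d in 1..n divisible by n1 and by n2, where n = n1*n2, gcd 1."""
    if n1 * n2 != n or gcd(n1, n2) != 1:
        raise ValueError("expected n = n1*n2 with n1 and n2 relatively prime")
    return [d for d in range(1, n + 1) if d % n1 == 0 and d % n2 == 0]


# For n = 6 = 2*3 the only admissible degree of a factor is 6 itself.
print(degrees_allowed_by_both(6, 2, 3))  # [6]
```

Only the full degree survives both divisibility conditions, which is the content of the last step of the proof.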

This theorem can be generalized to apply to s different values with
$n = n_1n_2 \cdots n_s$. The conditions on the approximants over $V_0$ can be fulfilled,
for example, by making the first approximant have the exponent $n_2$, for then
$\deg \phi(V_2)$ must be a multiple of $n_2$.

3. Examples. The theorem of Schönemann, as stated in §1, (1), follows
from our results, for the condition on $\phi(x)$ is sufficient to make $\phi$ a key polynomial
for a value

$V_2 = [V_0p = 1,\ V_1x = 0,\ V_2\phi(x) = 1/e],$

while the condition on M(x) makes the last term in the expansion of f(x) in
powers of $\phi$ have the value 1. Hence f(x) is irreducible by Theorem 2.

In a similar fashion, the various generalizations of the Eisenstein irre- 
ducibility theorem for polynomials with rational coefficients can all be shown 


* It may be observed that the assumption that Vo was discrete was made in §5 only to insure 
that the approximants and limit values give all possible values of K[x], so that this discreteness 
assumption is not needed here. 

t By Const II, Theorem 8.1, which applies whenever G has a non-vanishing discriminant. The 
6; in this theorem are then all0 or 1. 




to depend on the use of absolute values which are the extension of p-adic 
values. This is illustrated in the following list of known theorems which are 
special cases of our theorem applied to particular first and second stage val- 
ues. Unless otherwise indicated the author stated our Theorem 2 for the field 
of rational numbers and for the special value indicated, in most cases not ex- 
plicitly in terms of absolute values but in some equivalent form.* 

Eisenstein: $[V_0p = m,\ V_1x = 1]$.

Schönemann: $[V_0p = m,\ V_1x = 0,\ V_2\phi = 1]$.

Königsberger: $[V_0p = n,\ V_1x = r]$, $r > 0$. Also Theorem 5 for two such first stage values.

Bauer: $[V_0p = n,\ V_1x = 0,\ \ldots]$

Dumas: $[V_0p = n,\ V_1x = r]$, Newton polygons, Theorem 3.

$[V_0p = n,\ V_1x = r,\ V_2\phi = s]$, with restrictions.

Kürschák: $[V_0p = 1,\ V_1x = p,\ V_2f = v]$, Newton polygons, $V_1f = 0$.

Rella: As in Kürschák, for K a domain of integrity.

Ore: $[V_0p = n,\ V_1x = 0,\ V_2\phi = r]$, Newton polygons, Theorem 3.

Ore: $[V_0p = n,\ V_1x = r]$, Theorem 1.

Ore:‡ $[V_0p = n,\ V_1x = 0,\ \ldots]$ The degree-of-factors Theorem 3.
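For orientation, the first entry of this list is the classical Eisenstein criterion: a polynomial with integer coefficients is irreducible over the rationals if some prime p divides every coefficient except the leading one and p² does not divide the constant term. A minimal sketch of that classical test follows; the function name is an assumption, and the test only certifies irreducibility when it succeeds.

```python
def is_eisenstein(coeffs, p):
    """Classical Eisenstein test at the prime p.

    coeffs[j] is the integer coefficient of x**j, with the leading
    coefficient last. Returns True when p does not divide the leading
    coefficient, divides all other coefficients, and p*p does not divide
    the constant term. A False result says nothing about reducibility.
    """
    if len(coeffs) < 2:
        raise ValueError("need a polynomial of degree at least 1")
    *lower, lead = coeffs
    if lead % p == 0:
        return False
    if any(a % p != 0 for a in lower):
        return False
    return lower[0] % (p * p) != 0


print(is_eisenstein([5, 10, 0, 0, 1], 5))  # x^4 + 10x + 5 at p = 5: True
print(is_eisenstein([4, 0, 1], 2))         # x^2 + 4 at p = 2: False
```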

Irreducibility criteria have also been systematized by Blumberg§ in terms 
of a notion of “rank.” This “rank” is closely related to our “absolute value.” 
It applies also to differential expressions, but does not include the higher 
stage values. 

Our methods for constructing inductive values allow the construction of 

* The papers cited here are, in order: G. Eisenstein, Ueber die Irreduzibilität und einige andere
Eigenschaften der Gleichungen, etc., Journal für die Mathematik, vol. 39 (1850), p. 166; Th. Schönemann,
Von denjenigen Moduln, welche Potenzen von Primzahlen sind, Journal für die Mathematik,
vol. 32 (1846), pp. 93-105, §61; L. Königsberger, Ueber den Eisensteinschen Satz von der Irreduzibilität
algebraischer Gleichungen, Journal für die Mathematik, vol. 115 (1895), pp. 53-78, especially
(67) on p. 69; M. Bauer, Verallgemeinerung eines Satzes von Schönemann, Journal für die Mathematik,
vol. 128 (1905), pp. 87-89; G. Dumas, Sur quelques cas d'irréductibilité des polynomes à coefficients
rationnels, Journal de Mathématiques, (6), vol. 2 (1906), pp. 191-258; J. Kürschák, Irreduzible
Formen, Journal für die Mathematik, vol. 152 (1923), pp. 180-191; T. Rella, Ordnungsbestimmungen
in Integritätsbereichen und Newtonsche Polygone, Journal für die Mathematik, vol. 158 (1927), pp.
33-48; O. Ore, Zur Theorie der Irreduzibilitätskriterien, Mathematische Zeitschrift, vol. 18 (1923),
pp. 278-288; O. Ore, Zur Theorie der Eisensteinschen Gleichungen, Mathematische Zeitschrift, vol. 20
(1924), pp. 267-279.

† This is the first treatment of a non-linear criterion.

‡ O. Ore, Zur Theorie der algebraischen Körper, Acta Mathematica, vol. 44 (1924), pp. 219-314.
Theorem 4 on page 230 is stated for an algebraic number field as coefficient field, while Theorem 9
on page 240 gives the degree of any factor in terms of the first two stages, plus the key polynomials
only on the third stage. Hence this theorem differs in form from our statement.

§ H. Blumberg, On the factorization of expressions of various types, these Transactions, vol. 17
(1916), pp. 517-544.




examples of polynomials which are irreducible in virtue of arbitrarily complicated
inductive values not falling under the above cases. For example, the
value

$[V_0p = 4,\ V_1x = 0,\ V_2(x^2 + 1) = 2,\ V_3((x^2 + 1)^2 + p) = 5]$    (p = 3)

of the ring of polynomials with rational coefficients proves

$f(x) = [(x^2 + 1)^2 + p]^2 + p^2(x^2 + 1) = x^8 + 4x^6 + 12x^4 + 25x^2 + 25$

irreducible by Theorem 2, although the second stage approximant $V_2$ does
not show it irreducible. The use of non-homogeneous key polynomials (or of
constant degree inductive values) is illustrated by

$[V_0p = 2,\ V_1x = 0,\ V_2(x^2 + 1) = 2,\ V_3(x^2 + 1 + p) = 3]$    (p = 7),

which, by Theorem 2, proves the irreducibility of

$(x^2 + 1 + p)^2 + p^3 = x^4 + 16x^2 + 407.$
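The two expansions can be checked by direct multiplication. The sketch below assumes the readings ((x² + 1)² + p)² + p²(x² + 1) with p = 3 and (x² + 1 + p)² + p³ with p = 7, which are the forms consistent with the printed coefficients; polynomials are stored as coefficient lists, lowest degree first, and the helper names are ad hoc.

```python
def poly_mul(a, b):
    """Product of two coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def poly_add(a, b):
    """Sum of two coefficient lists of possibly different lengths."""
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
            for i in range(n)]


# First example, p = 3: ((x^2 + 1)^2 + p)^2 + p^2 (x^2 + 1).
p = 3
y = [1, 0, 1]                                  # x^2 + 1
phi3 = poly_add(poly_mul(y, y), [p])           # (x^2 + 1)^2 + p
f1 = poly_add(poly_mul(phi3, phi3), [p * p * c for c in y])
print(f1)  # [25, 0, 25, 0, 12, 0, 4, 0, 1]

# Second example, p = 7: (x^2 + 1 + p)^2 + p^3.
p = 7
q = [1 + p, 0, 1]                              # x^2 + 1 + p
f2 = poly_add(poly_mul(q, q), [p ** 3])
print(f2)  # [407, 0, 16, 0, 1]
```

Read from the highest degree down, the two lists are x⁸ + 4x⁶ + 12x⁴ + 25x² + 25 and x⁴ + 16x² + 407.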
A case of Theorem 1 not included in the linear Theorem 2 is

$V_2 = [V_0p = 1,\ V_1x = 0,\ V_2(x^2 + x + 1) = 1]$    (p = 5),

$f(x) = (x^2 + x + 1)^2 - p(x^2 + x + 1) + 3p^2x + 3p^3 = x^4 + 2x^3 - 2x^2 + 72x + 371.$

The residue-class ring of K[x] for this $V_2$ is by Const I, Theorem 12.1 just
the ring $F[\theta, y]$, where F is the field of integers, modulo 5, and $\theta$ is the residue-class
of x and so is the root of $x^2 + x + 1$ over F, while y is a symbol representing
the residue-class* of $(x^2 + x + 1)/p$. To test this f(x) for equivalence-irreducibility
we first multiply it by $p^{-2}$ to make it have the value 0. Then

$p^{-2}f(x) = [(x^2 + x + 1)/p]^2 - [(x^2 + x + 1)/p] + 3x + 3p,$

so that the residue-class polynomial is

$y^2 - y + 3\theta,$

which, although not linear, is irreducible over the Galois field $F(\theta)$. Hence
f(x) is equivalence-irreducible† and therefore irreducible by Theorem 1.
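The irreducibility of the residue-class polynomial can be confirmed by brute force, since a quadratic is irreducible over a field exactly when it has no root there. The sketch below assumes the residue-class polynomial is y² − y + 3θ with θ² + θ + 1 = 0 over the integers modulo 5, and represents an element a + bθ of the field of 25 elements as the pair (a, b); the helper names are illustrative only.

```python
P = 5  # the prime p of the example


def mul(u, v):
    """Multiply a + b*theta by c + d*theta using theta^2 = -theta - 1."""
    a, b = u
    c, d = v
    return ((a * c - b * d) % P, (a * d + b * c - b * d) % P)


def residue_poly(y):
    """Evaluate y^2 - y + 3*theta at the field element y = (a, b)."""
    y2 = mul(y, y)
    return ((y2[0] - y[0]) % P, (y2[1] - y[1] + 3) % P)


# x^2 + x + 1 has no root modulo 5, so theta really generates a field
# with 25 elements.
theta_poly_irreducible = all((x * x + x + 1) % P != 0 for x in range(P))

# Search all 25 elements for a root of y^2 - y + 3*theta.
roots = [(a, b) for a in range(P) for b in range(P)
         if residue_poly((a, b)) == (0, 0)]

print(theta_poly_irreducible, roots)
```

An empty list of roots means the quadratic has no linear factor over the field generated by θ, which is the irreducibility asserted in the text.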

4. Prime decomposition in algebraic rings. To interpret our irreducibility 
criteria for G(x) we first summarize some arithmetic properties of the corre- 
sponding residue-class ring 


(8)    $A = K[x]/(G(x)).$


* See Const I, §12, (6). 
Const I, Lemma 11.2, where R= 


3) 




Here, and throughout §§4 and 5, K denotes an algebraic number field, G(x)
is a polynomial in K[x], and $\mathfrak{O}$ the ring of all integers of the commutative
algebra A.


THEOREM 6. In $\mathfrak{O}$, every ideal B which is not a divisor of zero* has a decomposition,
unique except for the order of factors, as a product of prime ideals
from $\mathfrak{O}$. For ideals B and C, B not a divisor of zero, the inclusion $B \subset C$ implies
the existence of an ideal D with† B = CD. Let

(9)    $G(x) = g_1(x)^{e_1} g_2(x)^{e_2} \cdots g_s(x)^{e_s}$

be the decomposition of G(x) into distinct irreducible factors $g_i(x)$, and denote
by $K_i$ the algebraic field $K[x]/(g_i(x))$, and by $\mathfrak{O}_i$ the ring of all the integers of $K_i$.
For each prime ideal $P_i \ne \mathfrak{O}_i$ of the ring $\mathfrak{O}_i$ there is a corresponding prime ideal
P' of $\mathfrak{O}$, and the residue-class rings $\mathfrak{O}/P'$ and $\mathfrak{O}_i/P_i$ are isomorphic. These
ideals P' are all distinct and include all the prime ideals of $\mathfrak{O}$, except for $\mathfrak{O}$ itself.
If p is a prime ideal of the ring of integers of the base field K, the decomposition
of p in $\mathfrak{O}$ may be found by decomposing p in each $\mathfrak{O}_i$, replacing each prime factor
$P_i$ in these decompositions by the corresponding P' and multiplying the resulting
decompositions.


The proof is omitted, since the results are implicit in the more general
arithmetic of non-commutative algebras.‡ The theorem can easily be obtained
directly by the usual consideration of A as the direct sum of the
fields $K_i$.

The degree of a prime ideal P' in the ring of integers $\mathfrak{O}$ of Theorem 6 is
defined to be the degree of its residue-class ring $\mathfrak{O}/P'$ over the residue-class
ring of p, where p is the prime ideal of K such that $p \cdot \mathfrak{O} \subset P'$. This is, by the
theorem, the same as the degree of the corresponding prime ideal $P_i$ of $K_i$
over p. The relation between the degree of an algebraic number field and the
degree and exponents of prime ideals yields then the following analogue to
our irreducibility theorem.


THEOREM 7. The degree-of-factors theorem. If G(x) has no multiple factors
and if, in the ring of integers of the algebraic number field K, p is any
prime ideal with the decomposition


* An ideal B in $\mathfrak{O}$ is a divisor of zero if every element of B is a divisor of zero.

† This second property, “every divisor is a factor,” is closely associated with the decomposition
into prime ideals (van der Waerden, Moderne Algebra, vol. 2, §100). When it holds for all ideals, including
divisors of zero, the ring is called a multiplication-ring. See Krull, Idealtheorie, Ergebnisse
der Mathematik, vol. 5, p. 26.

‡ M. Deuring, Algebren, Ergebnisse der Mathematik, vol. 4, p. 108. E. Artin, Zur Arithmetik
hyperkomplexer Zahlen, Abhandlungen des Mathematischen Seminars, Hamburg, vol. 5 (1928), pp.
261-289.


(10)    $p = \prod_i P_i^{b_i}$    $(b_i = \exp P_i)$

into prime ideals in $\mathfrak{O}$, then any factor f(x) of G(x) has a degree of the form

(11)    $\deg f(x) = \sum{}' (\exp P_i)(\deg P_i),$

where the sum is to be taken over any subset of the given set of prime ideal factors
$P_i$.


THEOREM 8. The irreducibility criterion. If p is a prime ideal from the algebraic
number field K and if p has but one prime ideal factor in $\mathfrak{O}$, then the polynomial
G(x) is a power of an irreducible polynomial.

Proof. The decomposition of p in $\mathfrak{O}$ is obtained, as in Theorem 6, by combining
decompositions from all the direct summand fields $K_i$. If the final decomposition
of p is to have but one factor, there can be only one such direct
summand and hence only one irreducible factor $g_i(x)$ in (9).


THEOREM 9. If K is an algebraic number field, $\delta$ an integer in K, and G(x)
a polynomial such that the principal ideal $(\delta)$ becomes the nth power of an ideal
in $\mathfrak{O}$, then any factor of G(x) in K[x] has a degree r such that $(\delta)^r$ is the nth
power of some ideal in the ring of integers of K.

One case of this theorem, in which the decomposition $(\delta) = B^n$ was insured
by specifying the form of G(x) and taking $n = \deg G(x)$, was first stated
by Sopman.* The general Theorem 9 can be established by Sopman's methods,
applied to the fields $K_i$, or by the direct use of Theorem 8.

5. Approximants and prime ideals in algebraic rings. Our two forms of 
the irreducibility criterion are essentially the same, because the approximants 
to any polynomial G(x) can be used to construct the prime ideals in the corre- 
sponding K[x]/(G(x)) =A. For a prime ideal p from the algebraic number 
field K let $V_0$ be the p-adic absolute value of K. Without essential loss of
generality, we assume throughout this section that G(x) has first coefficient 1
and its other coefficients $V_0$-integers. If in (9) we make all factors $g_i(x)$ have
the first coefficient 1, they will also have $V_0$-integers as coefficients.

THEOREM 10. Given a G(x) and a p-adic value $V_0$ of K, there is for k sufficiently
large a one-to-one correspondence between the kth approximants $V_k$ to G
over $V_0$, and the prime ideal factors P' of p in $\mathfrak{O}$. If $\phi_k$ is the last key polynomial
of $V_k$, and P' is the corresponding prime ideal, then

(12)    $\deg \phi_k = (\deg P')(\exp P').$


* M. Sopman, Ein Kriterium für Irreduzibilität ganzer Funktionen in einem beliebigen algebraischen
Körper, Mathematische Annalen, vol. 91 (1924), pp. 60-61.


Proof. If for G as in (9) we set $G^*(x) = g_1(x)g_2(x) \cdots g_s(x)$, then, by Lemma
1, G and G* have the same kth approximants. The “finiteness” Theorem 8.1
of Const II applies to G*, and gives a k' so large that, for $k \ge k'$, every kth
approximant $V_k$ to G* has the projection 1. Each $V_k$ is then, by (5), an approximant
to just one factor $g_i(x)$ of G*. But each approximant to the irreducible
$g_i(x)$ constructs an extension W of the value $V_0$ to the field $K_i$
(Const II, Theorem 10.2). Each such W corresponds to just one prime ideal
factor $P_i$ of p in $\mathfrak{O}_i$, while, by Theorem 6, $P_i$ corresponds to just one prime
ideal factor P' of p in $\mathfrak{O}$. Combining these successive correspondences we find
that approximants $V_k$ do correspond to prime ideals P', as asserted. The relation
(12) follows by Const II, Theorem 9.3.

With this connection between the approximants and prime ideals, the two
forms of the irreducibility criteria become essentially identical. For if
G(x) has no multiple roots, then in the presence of Theorem 10, the irreducibility
criterion of Theorem 8 in terms of prime ideals is immediately equivalent
to the irreducibility criterion of Theorem 4 in terms of approximants.
Similar results hold for the degree-of-factors theorem, if k is large, for then
(12) reduces (6) to (11). Hence the generalizations of the Eisenstein criterion
are merely statements about prime ideal decompositions.

6. Irreducibility of polynomials in several variables. A number of theorems
concerning such polynomials with coefficients in any field F will now
be derived from the general results of §2.


THEOREM 11. If $\phi(x)$ is irreducible over the field F, if g(x, y) is a polynomial
in F[x, y] of degree in y less than n, and if $a(x) \not\equiv 0 \pmod{\phi}$ and $g(x, 0) \not\equiv 0 \pmod{\phi}$, then

$f(x, y) = a(x)y^n + g(x, y)\phi(x)$

is irreducible in F[x, y], except perhaps for a factor involving only x.

Proof. By the irreducibility of $\phi(x)$,

$V_0 = [V_0F = 0,\ V_0x = 0,\ V_0\phi(x) = 1]$

is a value of the coefficient field F(x), where we have indicated by $V_0F = 0$
that $V_0a = 0$ for all constants $a \ne 0$ in F. The inductive value
$V_1 = [V_0,\ V_1y = 1/n]$ is an approximant to f(x, y), and the irreducibility then
follows by Theorem 2 applied to this $V_1$. By altering the value $V_1y$ to $m_n/n$
we obtain a more general theorem of this sort:


THEOREM 12. If $\phi = \phi(x)$ is irreducible over F, if

$f(x, y) = a_0(x)y^n + a_1(x)\phi^{m_1}y^{n-1} + \cdots + a_n(x)\phi^{m_n},$

where the $a_i(x)$ are polynomials in F[x] with $a_0(x) \not\equiv 0 \pmod{\phi}$, $a_n(x) \not\equiv 0 \pmod{\phi}$,
and if the positive integers $m_i$ are such that $m_n$ is prime to n and
$n \cdot m_i \ge i \cdot m_n$ for $i = 1, \cdots, n$, then f(x, y) is irreducible in F[x, y], except perhaps
for a factor involving only x.
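The numerical hypotheses of Theorem 12 are easy to test for a given list of exponents. The sketch below reads them as: every m_i is a positive integer, m_n is prime to n, and n·m_i ≥ i·m_n for i = 1, ..., n. The function name is hypothetical, and the congruence conditions on a_0(x) and a_n(x) are not checked.

```python
from math import gcd


def theorem12_exponents_ok(m):
    """Check the exponent conditions for m = [m_1, ..., m_n].

    Returns True when all m_i are positive, gcd(m_n, n) = 1, and
    n*m_i >= i*m_n for every i. Only the arithmetic part of the
    hypotheses is covered.
    """
    n = len(m)
    if n == 0 or any(mi <= 0 for mi in m):
        return False
    m_n = m[-1]
    if gcd(m_n, n) != 1:
        return False
    return all(n * m[i - 1] >= i * m_n for i in range(1, n + 1))


print(theorem12_exponents_ok([1, 1, 1]))  # True
print(theorem12_exponents_ok([1, 2, 4]))  # False: 3*1 < 1*4
```

Geometrically the inequality places the points (i, m_i) on or above the line joining (0, 0) to (n, m_n), so that the first approximant is determined by a single side of the polygon.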

A special case of this theorem, for $\phi(x) = x - a$, was first stated by Königsberger.*
If F is the field of all complex numbers the conditions of this theorem
imply that the sheets of the Riemann surface hang together in a single cycle 
at x=a, so that f(x, y) is necessarily irreducible. In the theorem above, this 
single cycle is obtained by the first approximant; that is, by the first step 
of the Puiseux expansion. Similar theorems involving further steps of the 
expansion can be stated, but they are all consequences of our general criteria. 
The theorem corresponding to the point at infinity on the Riemann surface 
runs as follows: 

THEOREM 13. If a;(x) fori=0,1,---, ” are polynomials in F [x] with no 
common factors except constants, if 


y) = ao(x)y" + + + a(x), 
if deg a;(x) =; satisfies the conditions 


(13) (vi — Yo) < — Yo), Yn — Yo f), 


and if Yn—Yo is prime to m, then f(x, y) is irreducible in F |x, y]. 


A related theorem stated by Perron{ replaces (13) by the condition that 
there is an integer 7 such that 


Vi > Yi» < 17; (fori = 1,---,m, but i ¥ 7) 


holds, while a(x) =1, and y; is prime to 7. This theorem is apparently more 
general, but it can be deduced from Theorem 13 by simply interchanging x 
and y. The effect of the change can be visualized by constructing the Newton 
polygon for V». 

Still other types of theorems are possible in case the coefficient field is 
not algebraically closed. 


THEOREM 14. If $(x) is irreducible over F, if 
v(x, y) = 9° + ai(x)y""" a;(x) in F[x], 
is irreduciblet (mod $(x)) in F |x, y], if 


* Königsberger, loc. cit., p. 63.

† O. Perron, Neue Kriterien für die Irreduzibilität algebraischer Gleichungen, Journal für die
Mathematik, vol. 132 (1907), p. 304. See also Blumberg, loc. cit., p. 543.

‡ In other words if $\psi(\theta, y)$ is irreducible in $F(\theta)[y]$, where $\theta$ is a root of $\phi(x) = 0$.


S(x,y) = W(x, + g(x, + r(x, y)o(2), 


where the degrees of g and r in y satisfy deg,r(x, y) <m; deg,g(x, y) <m(e—1), 
and if r(x, y) 40 (mod ¢), then f(x, y) is irreducible in F[x, y]. 


Proof. If we omit the case where m = 1 and $a_m(x) \equiv 0 \pmod{\phi}$, the conditions
on $\psi(x, y)$ suffice to make $\psi$ a key polynomial over the value

$V_1 = [V_0F = 0,\ V_0x = 0,\ V_0\phi(x) = 1,\ V_1y = 0]$

of F[x, y]. The only second stage approximant to f over $V_0$ is
$V_2 = [V_1,\ V_2\psi(x, y) = 1/e]$. The given form of f(x, y) shows that the expansion of
f will satisfy the conditions of Theorem 2 for irreducibility. In the omitted
case where m = 1 and $a_m(x) \equiv 0 \pmod{\phi}$, the theorem is a direct consequence
of Theorem 11.

Other theorems may be obtained by using the non-trivial values of the 
coefficient field F. 


THEOREM 15. If R is the field of rational numbers, if p is a prime, if $\phi(x)$
is a polynomial irreducible (mod p), if

$f(x, y) = y^n + g(x, y)\phi(x) + h(x, y)p,$

where the polynomials g and h have integral coefficients and degrees in y less
than n, and if either $g(x, 0) \not\equiv 0$ (modd p, $\phi(x)$) or $h(x, 0) \not\equiv 0$ (modd p, $\phi(x)$)
holds, then f(x, y) is irreducible in R[x, y].

This theorem follows from Theorem 2 applied to the value

$V_1 = [V_0p = 1,\ V_0x = 0,\ V_0\phi(x) = 1,\ V_1y = 1/n].$

If h = 0, Theorem 15 is a special case of Theorem 11. Other simple cases of
Theorem 15 arise for $\phi(x) = x$. Under the same conditions on $\phi$, g, and h, it is
also possible to assert the irreducibility of

$y^n + g(x, y)\phi(x)^e + h(x, y)p^f,$

provided (e, n) = 1 = (f, n).
Theorems on polynomials in several variables may also be stated. For ex- 
ample, the following theorem can be proven for any number of variables. 


THEOREM 16. If F is any field, if

$f(x, y, z) = a(y, z)x^k + b(x, z)y^m + c(x, y)z^n$

where a(y, z), b(x, z), and c(x, y) are polynomials with coefficients in F, each
with degrees less than k, m, and n in x, y, and z respectively, while $a(0, 0) \ne 0$,
$b(0, 0) \ne 0$, and $c(0, 0) \ne 0$, and if (k, m) = 1, (k, n) = 1, (m, n) = 1, then f(x, y, z)
is irreducible in F[x, y, z].




Shanok* has also stated irreducibility criteria for polynomials in several 
variables in terms of convex polyhedra. Corresponding to each term x*y®p7 in 
the polynomial there is a point with the coordinates (a, 8, y). Each side of 
the convex polyhedron on these points corresponds to a suitable absolute 
value V=[Vp=\, Vx=u, Vy=v], chosen so that three distinct terms in the 
original polynomial have the same minimum value. Thus, although Shanok’s 
theorems are not special cases of ours, they could be stated in terms of abso- 
lute values. 


* C. Shanok, Convex polyhedra and criteria for irreducibility, Duke Mathematical Journal, vol. 2 
(1936), pp. 103-111. 


CORNELL UNIVERSITY,
ITHACA, N. Y.


ON THE GROWTH OF ANALYTIC FUNCTIONS* 


BY 
NORMAN LEVINSON†


1. Pólya,‡ in a restricted case, and Bernstein,§ under rather general conditions,
have, to state their results roughly, proved that the rate of growth of an
analytic function along a line can be determined by its growth along a suitable
sequence of discrete points on the line. In proving his results Bernstein uses
certain rather deep theorems from the theory of Dirichlet series and points
out¶ that as yet no proof of the results has been obtained using ordinary function
theory.

Here we shall give a simple function-theoretic proof of a set of theorems 
which include those of Bernstein. We shall then refine our methods and obtain 
a set of new theorems which are remarkably precise. (See Theorem VII for 
example.) 

We shall deal exclusively with functions which are analytic in a sector and
of order 1.|| For use here we can define the Phragmén-Lindelöf function for
a function f(z) analytic in a sector $|\operatorname{am} z| \le \alpha$, as

(1.0)    $h(\theta) = \limsup_{r\to\infty} \frac{\log|f(re^{i\theta})|}{r}, \qquad |\theta| \le \alpha.$


The following theorems are among those which will be proved. 


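THEOREM I.** Let $\phi(z)$ be analytic in some sector $|\operatorname{am} z| \le \alpha$. Let $h(\theta)$, defined
as in (1.0), be its Phragmén-Lindelöf function and let h(0) = a. Suppose

(1.1)    $h(\theta) \le a\cos\theta + b|\sin\theta|, \qquad |\theta| \le \alpha.$

Let $\{z_n\}$ be a sequence of complex numbers such that

(1.2)    $\lim_{n\to\infty} \frac{n}{z_n} = D,$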


* Presented to the Society, September 10, 1937; received by the editors March 4, 1937.

† National Research Fellow.

‡ G. Pólya, Untersuchungen über Lücken und Singularitäten von Potenzreihen, Mathematische
Zeitschrift, vol. 29 (1929).

§ V. Bernstein, Généralisation et Conséquences d'un théorème de Le Roy-Lindelöf, Bulletin des
Sciences Mathématiques, vol. 52 (1928); and Séries de Dirichlet, Chapter IX, Paris, 1933.

¶ V. Bernstein, Séries de Dirichlet, loc. cit., pp. 229 and 249.

|| Functions of any other finite order can be transformed to functions of order 1 by $w = z^{\rho}$.

** In the case of real $\{z_n\}$ a proof of this theorem and a related gap theorem were given by the
author in the Bulletin of the American Mathematical Society, vol. 42 (1936), p. 702.


240 


GROWTH OF ANALYTIC FUNCTIONS 241 


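where D is real, and such that for some d > 0

(1.3)    $|z_n - z_m| \ge |n - m|d.$

If

(1.4)    $\pi D > b,$

then

(1.5)    $\limsup_{n\to\infty} \frac{\log|\phi(z_n)|}{|z_n|} = \limsup_{r\to\infty} \frac{\log|\phi(r)|}{r}.$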

In the case where the sequence $\{z_n\}$ is real, Theorem I is due to Bernstein.*
We observe that (1.2) requires, if $z_n = r_ne^{i\theta_n}$, that $n/r_n \to D$ and that
$|\theta_n| \to 0$ as $n \to \infty$. In other words all except a finite number of the $z_n$ will lie
in any sector containing the real axis.

A special case of Theorem I is the following: 


THEOREM II. Let $\phi(z)$ be analytic and of exponential type† in the half-plane
$|\operatorname{am} z| \le \tfrac12\pi$. If L is defined by

(1.6)    $h(\tfrac12\pi) + h(-\tfrac12\pi) = 2\pi L$

and if $\{z_n\}$ is a sequence satisfying (1.2) and (1.3), then

(1.7)    $D > L$

implies (1.5).

It is easy to see by considering $\sin \pi z$ at the points $z_n = n$ that (1.7) is critical,
for in this case D = L = 1 and it is clear that (1.5) is not true. Nevertheless
we shall show that the condition (1.7) can be weakened considerably without
disrupting the theorem. It is in weakening this condition that we obtain a
new and very precise set of theorems.
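The critical example can be written out explicitly. The following short computation is a sketch that uses only the elementary identity for the modulus of the sine, with z = x + iy = re^{iθ}; it shows why D = L = 1 for the integers and why the growth along the sequence then fails to match the growth along the real axis.

```latex
% phi(z) = sin(pi z): modulus and Phragmen-Lindelof function
|\sin \pi(x+iy)|^{2} = \sin^{2}\pi x + \sinh^{2}\pi y ,
\qquad
h(\theta) = \limsup_{r\to\infty}\frac{\log|\sin(\pi r e^{i\theta})|}{r}
          = \pi\,|\sin\theta| .

% On the imaginary axis h(\pm\pi/2) = \pi, which corresponds to L = 1.
% The integers z_n = n satisfy n/z_n = 1, so D = 1 = L and (1.7) fails.
\limsup_{n\to\infty}\frac{\log|\sin \pi n|}{n} = -\infty
\qquad\text{while}\qquad
\limsup_{x\to\infty}\frac{\log|\sin \pi x|}{x} = 0 .
```

Since the function vanishes at every point of the sequence, no information about its growth on the real axis can be recovered from the values at the integers.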

The simplest of these theorems is 


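THEOREM VII. Let $\phi(z)$ be analytic and of exponential type in the sector
$|\operatorname{am} z| \le \tfrac12\pi$. Let

$\phi(iy) = O(1), \qquad |y| \to \infty.$

Let $\{z_n\}$ be a sequence of density‡ $D \ge 0$, such that $|\operatorname{am} z_n| \to 0$ as $n \to \infty$ and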

* Loc. cit., Chapter IX. Likewise Theorem II stated below is due to Bernstein in the case where 
the {z,} are real. 

† A function f(z) is of exponential type in a sector, $|\operatorname{am} z| \le \alpha$, if $f(z) = O(e^{C|z|})$, $|\operatorname{am} z| \le \alpha$, for
some constant C.

‡ In case the density of $\{z_n\}$ is greater than zero, Theorem II gives Theorem VII at once. Thus
this theorem is really of interest when D = 0 in which case Theorem II cannot be applied.




242 NORMAN LEVINSON [March 


$|z_n - z_m| \ge |n - m|d$. A necessary and sufficient condition that (1.5) hold is that

(1.8)    $\sum_{n=1}^{\infty} \frac{1}{|z_n|} = \infty.$

Clearly Theorem VII is a much sharper result than that obtained by trying
to apply Theorem II. For example, if $z_n = n\log(1 + n)$, Theorem VII is
applicable while Theorem II is not.

The condition $\phi(iy) = O(1)$ in Theorem VII can easily be replaced by

$\int_{-\infty}^{\infty} \frac{\log^{+}|\phi(iy)|}{1 + y^2}\,dy < \infty.$

Analogous results are true in Theorem VI which follows and in related theorems.
Another of these theorems is

THEOREM VI. Let $(z) be analytic and of exponential type in the half-plane 
|am z| <4r. Let 
o(iy) = 
and let $\{\lambda_n\}$ be an increasing sequence of positive numbers satisfying

(1.9)    $\lim_{n\to\infty} \frac{n}{\lambda_n} = D, \qquad \lambda_{n+1} - \lambda_n \ge d > 0.$

Let $\Lambda(u)$ be the number of $\lambda_n < u$. If


A(u) — uL 
and if* 
(1.11) A(u) > Lu +C 


for some C, then

$\limsup_{n\to\infty} \frac{\log|\phi(\lambda_n)|}{\lambda_n} = \limsup_{x\to\infty} \frac{\log|\phi(x)|}{x}.$

Theorem VI goes much further than Theorem II in that it does not ex- 
clude the possibility of D=L. 

We shall first give simple function-theoretic proofs of Theorems I, II, and 
related results which are due to Bernstein in case {z,} is real. In §3 we 


* This condition is by no means critical. It can for example be replaced by $\Lambda(u) > Lu - u^{\alpha} - C$,
$\alpha < 1$. However condition (1.10) is the divergence condition (1.8) of Theorem VII, and, as in that
theorem, it is easy to show that it is a best possible result.

will turn to the proofs of our new results such as Theorems VI, VII, and related
theorems.

2. Here we concern ourselves with Theorems I and II and certain of their
extensions.

The following result of Phragmén and Lindeléf will be of basic impor- 
tance:* 

THEOREM A. Let f(z) be an analytic function of $z = re^{i\theta}$, regular in the region
D between two straight lines making an angle $\pi/\alpha$ at the origin, and on the lines
themselves. Suppose $|f(z)| \le M$ on the lines and that as $r \to \infty$, $f(z) = O(e^{r^{\beta}})$ uniformly
in D for some $\beta < \alpha$. Then $|f(z)| \le M$ throughout D.

We also require 


Lemma 1. Let {zn} satisfy (1.2) and (1.3). If 


(2.0) F@) = II(1- =), 


1 n 


then along the line am 2 =0 


] F(re® 
(2.1) 6 #0, 7. 
Tr 
Also 
] F 
(2.2) lim ap = 0. 
| x| 
Moreover 
1 
and for any e>0 
(2.4 = 


F’ (zn) 


For real {z,} all these results are well known.f (2.3) can be made much 
more precise but suffices for our purposes. The proof of this lemma is quite 
straightforward. Since it is very much the same as for real {z,} we omit it. 


* See, e.g., Titchmarsh, Theory of functions, Oxford, 1932, p. 177. 
† F. Carlson, Über Potenzreihen mit endlich vielen verschiedenen Koeffizienten, Mathematische
Annalen, vol. 79 (1919), pp. 237-245, especially pp. 239-240.


Proof of Theorem I. We observe that there is no loss of generality in taking
$\alpha < \tfrac12\pi$. Clearly if (1.5) does not hold there exists a c such that


] l 
(2.5) lim sup og | (x) | 
x 


ne | | 
It follows from (2.4) and (2.5) that 
(2.6) g(z) = e“F(z) 
is an entire function. Since g(z,) =(z,) it follows that 
$(2) — g(z) 
F(z) 
is analytic for |am z| <a@. Using (1.1), (2.3), and (2.6), it follows that for 
|z—z,| 24d 
(2.8) ¥(z) = 


But ¥(z) is analytic and (2.8) being true on the circles |z—z,| =}d must be 

true inside, since a function analytic in a domain takes its maximum value on 

the boundary of the domain. Therefore (2.8) holds in the entire sector. 
From (1.1), (2.1), (2.6), and (2.7)

$\psi(re^{\pm i\alpha}) = O(\exp[r(a\cos\alpha + b\sin\alpha - \pi D\sin\alpha + \epsilon)] + \exp[cr\cos\alpha]), \qquad \epsilon > 0.$


(2.7) ¥(z) = 


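Or setting $\gamma = (\pi D - b)\tan\alpha$ we have

(2.9)    $\psi(re^{\pm i\alpha}) = O(\exp[r\cos\alpha\,(a - \gamma + \epsilon\sec\alpha)] + \exp[cr\cos\alpha]).$

Since $\pi D > b$, $\gamma > 0$. If we take $\epsilon < \tfrac12\gamma\cos\alpha$, then (2.9) becomes

$\psi(re^{\pm i\alpha}) = O(\exp[pr\cos\alpha]), \qquad p = \max(a - \tfrac12\gamma,\ c).$

In other words $\psi(z)e^{-pz}$ is bounded for $\operatorname{am} z = \pm\alpha$. But by Theorem A, this
and (2.8) imply that it is bounded in the entire sector $(-\alpha, \alpha)$. Thus in particular

(2.10)    $\psi(x) = O(e^{px}), \qquad p = \max(a - \tfrac12\gamma,\ c).$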

But by (2.7), $\phi(x) = \psi(x)F(x) + g(x)$. When we use (2.2), (2.6), and (2.10),
this gives

$\limsup_{x\to\infty} \frac{\log|\phi(x)|}{x} \le \max(a - \tfrac12\gamma,\ c).$

This contradicts the assumption that h(0) = a and proves Theorem I.




Proof of Theorem II. There is no loss in generality in assuming that
$h(\tfrac12\pi) = h(-\tfrac12\pi)$. Let us set

$\limsup_{x\to\infty} \frac{\log|\phi(x)|}{x} = a.$

Then clearly when we apply Theorem A to $\phi(z)\exp[(\pi L + \delta)iz - (a + \epsilon)z]$,
$\epsilon > 0$, in the upper right quadrant, it follows at once from Theorem A that
here

$h(\theta) \le a\cos\theta + \pi L|\sin\theta|.$

Similarly this holds in the lower right quadrant. Theorem II now follows at
once from Theorem I.


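THEOREM III. Let $\Phi(z)$ be an analytic function in the right half-plane $|\mathrm{am}\, z| \le \tfrac12\pi$ such that

(2.11) $\Phi(re^{i\theta}) = O(\exp[(a\log r\cos\theta + \pi b|\sin\theta| + \epsilon)r]), \quad |\theta| \le \tfrac12\pi,$

where $a \ge 0$, $b \ge -\tfrac12 a$, and $\epsilon$ is an arbitrary positive quantity. If $\{z_n\}$ is a sequence satisfying (1.2) and (1.3) and if

(2.12) $D > b + \tfrac12 a,$
then
(2.13) $\limsup_{n\to\infty} \dfrac{\log|\Phi(z_n)|}{|z_n|} \le \pi p$
implies
(2.14) $\Phi(re^{i\theta}) = O(\exp[\pi r(p\cos\theta + b|\sin\theta| + \epsilon)]).$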

Proof. We shall assume that $a > 0$, for if $a = 0$ we have Theorem II. Let

(2.15) $\phi(z) = \dfrac{\Phi(z)}{\Gamma(1 + az)}.$

From Stirling's formula it follows easily that for $|\theta| < \pi$ and large $r$

$\log|\Gamma(1 + are^{i\theta})| - ar\log r\cos\theta + ar\theta\sin\theta + ar\cos\theta - \tfrac12\log r = O(1).$

Thus from (2.11)
(2.16) $\phi(re^{i\theta}) = O(\exp[(\pi b|\sin\theta| + a\theta\sin\theta + a\cos\theta + \epsilon)r]), \quad |\theta| \le \tfrac12\pi,$
and from (2.13)
(2.17) $\phi(z_n) = O(\exp[(-a\log|z_n|\cos\theta_n + \pi p + a + \epsilon)|z_n|]).$
Using (2.12), (2.16), and (2.17) in Theorem II, we see that




] 
r 


As in the proof of Theorem I we define 
= (s,)e4~ 
(2.19) 


where here $A$ is any real number. We also consider $\psi(z) = \{\phi(z) - g(z)\}/F(z)$. As in Theorem I, $\psi(z)$ is an analytic function satisfying (2.8) for $|\mathrm{am}\, z| \le \tfrac12\pi$. Along the imaginary axis (assuming $\Re z_n > 1$ as we may with no restriction)


o(i = 
(iy) ) 
F(iy) 1 | F’(zn) 
By (2.1), (2.12), and (2.16), $|\phi(iy)/F(iy)|$ is bounded. Thus there exists some $M_1 > 0$, which is entirely independent of $A$, such that


(2.20) | | < Mi + > 


Along $\mathrm{am}\, z = \pm\tfrac14\pi$ we have, using (2.1), (2.18), and (2.19),
$\psi(re^{\pm i\pi/4}) = O(\exp[-2Ar] + \exp[-Ar\cos\tfrac14\pi]).$


| ¥(iy) | < max 


Using this and (2.20) in Theorem A, we see that ¥(z)e4: is bounded in the 
sectors Sam and —}m<am 2S}r, or in the entire right half-plane. 
Again from Theorem A, (2.20) now implies that 


< hr. 
F’(2,) |am z| < 


In particular then 


(2.21) | (re™*/4) |< < exp [— Ar cos ( 


In the following we use $M_2$, $M_3$, etc., to represent positive constants independent of $A$. Using (2.19) and (2.1), we have


(2.22) exp Ar cos tx + | |. 

1 F’ (Zn) 

Since $\phi(z) = g(z) + \psi(z)F(z)$ we have, using (2.1), (2.21), and (2.22),
) 


| | < exp [— Ar cosim + Ms(1 + > |—— 
11 F’(2n) 




Or setting A =a log r, we have 


| | 
(2.23) 


n 
< exp [— arlogrcos}4 + #Dr|M; (1 >> o(z og r] ). 
1 Zn 


If we use (2.4) and (2.17), we have 


| exp [azn log r] 
F' (Zn) 
1 


r 


But $\exp[-au\log(u/r)]$ has for its maximum value as $u$ varies $e^{ar/e}$. Thus if $B > 1$ is so large that $\log B > 2(\pi p + a + \epsilon)/a$, then


| exp log r] 
(Zn) 


= o(« exp [(rp + a + + ar/e] 


+ > exp [—|z,| (2cosd, — 1)(rp + 


leql>Br 
$= O(\exp[Br(2a + \pi p + \epsilon)]).$
Using this in (2.23) gives
$\phi(re^{\pm i\pi/4}) = O(\exp[-ar\log r\cos\tfrac14\pi + Cr]),$
where $C = \pi D + B(2a + \pi p + \epsilon)$. Or in (2.15)
$\Phi(re^{\pm i\pi/4}) = O(e^{Cr}).$


But this and (2.11) for $\theta = \pm\tfrac12\pi$, used in Theorem A, show that $\Phi(z)$ is of exponential type in the right half-plane. Theorem III now follows at once from Theorem II.
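The maximum of $\exp[-au\log(u/r)]$ quoted in the proof can be checked numerically. The following is a minimal sketch, not part of the original argument: the values of `a` and `r` are arbitrary illustrations, and the exponent is read as $-au\log(u/r)$, whose derivative $-a(\log(u/r) + 1)$ vanishes at $u = r/e$.

```python
import math

def exponent(u, a, r):
    """Exponent -a*u*log(u/r) of exp[-a u log(u/r)], defined for u > 0."""
    return -a * u * math.log(u / r)

def grid_maximum(a, r, samples=200_000):
    """Brute-force maximum of the exponent on a grid of u in (0, 2r].

    For u > r the exponent is negative, so (0, 2r] contains the global maximum.
    """
    return max(exponent(2.0 * r * k / samples, a, r) for k in range(1, samples + 1))

# Illustrative values only (assumptions, not taken from the paper).
a, r = 1.5, 40.0
closed_form = a * r / math.e      # value of the exponent at the critical point u = r/e
numeric = grid_maximum(a, r)
print(closed_form, numeric)
```

The grid maximum agrees with $ar/e$ to within the grid resolution, which is the bound $e^{ar/e}$ used above.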


THEOREM IV. If in Theorem III, (2.13) is replaced by
(2.24) $\Phi(z_n) = O(|\exp[-kz_n\log|z_n|]|), \quad k > 0,$
then (2.14) is replaced by
(2.25) $\Phi(re^{i\theta}) = O(\exp[(-k\log r\cos\theta + \pi b|\sin\theta| + \epsilon)r]), \quad |\theta| \le \tfrac12\pi.$

Proof. This proof is identical with that of Theorem III except that in (2.17) we take account of (2.24) and replace $a$ by $a + k$. This modification will then cause a corresponding change in (2.23) where $a$ is again replaced by $a + k$ since now we set $A = (a + k)\log r$. This finally gives us




$\Phi(re^{\pm i\pi/4}) = O(\exp[-kr\log r\cos\tfrac14\pi + Cr]).$
From this we see that $\Phi(z)\Gamma(1 + kz)$ is of exponential type and therefore Theorem II can be applied to obtain Theorem IV.

THEOREM V. If $\Phi(z)$ satisfies the requirements of Theorem III with (2.13) replaced by (2.24) and
(2.26) $k > 2b,$
then $\Phi(z) \equiv 0$.


In proving this and subsequent results of this type we use a fundamental theorem* of Carleman, or rather a consequence of this theorem.
Let $f(z)$ be analytic in the half-plane $|\mathrm{am}\, z| \le \tfrac12\pi$ and let $R > 1$. Then

(2.27) $\dfrac{1}{2\pi}\int_1^R \Bigl(\dfrac{1}{y^2} - \dfrac{1}{R^2}\Bigr)\log|f(iy)f(-iy)|\,dy + \dfrac{1}{\pi R}\int_{-\pi/2}^{\pi/2} \log|f(Re^{i\theta})|\cos\theta\,d\theta + A > 0,$

where $A$ is some number depending only on $f(z)$.


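Proof. Applying (2.27) to $\Phi(z)$ we have, assuming it is not identically zero,

$-A < \dfrac{1}{2\pi}\int_1^R \Bigl(\dfrac{1}{y^2} - \dfrac{1}{R^2}\Bigr)\log|\Phi(iy)\Phi(-iy)|\,dy + \dfrac{1}{\pi R}\int_{-\pi/2}^{\pi/2} \log|\Phi(Re^{i\theta})|\cos\theta\,d\theta.$

Using (2.25) and replacing $A$ by another constant $A_1$, we get
$-A_1 < (b + \epsilon)\log R - \tfrac12 k\log R,$

or $2(b + \epsilon) \ge k$. But by (2.26) this is impossible for arbitrarily small $\epsilon$. Thus $\Phi(z) \equiv 0$.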
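The two terms on the right of the last inequality come from elementary integrals. The computation below is a sketch of that arithmetic only, under the reading of (2.25) as $\log|\Phi(re^{i\theta})| \le (-k\log r\cos\theta + \pi b|\sin\theta| + \epsilon)r + O(1)$; bounded contributions are absorbed into the constant.

```latex
\begin{aligned}
\frac{1}{2\pi}\int_1^R\Bigl(\frac{1}{y^2}-\frac{1}{R^2}\Bigr)\,2(\pi b+\epsilon)\,y\,dy
  &= \frac{\pi b+\epsilon}{\pi}\Bigl(\log R-\frac12+\frac{1}{2R^2}\Bigr)
   \le (b+\epsilon)\log R, \\
\frac{1}{\pi R}\int_{-\pi/2}^{\pi/2}\bigl(-k\log R\cos\theta+\pi b|\sin\theta|+\epsilon\bigr)R\cos\theta\,d\theta
  &= -\frac{k\log R}{\pi}\cdot\frac{\pi}{2}+b+\frac{2\epsilon}{\pi}
   = -\tfrac12 k\log R+O(1).
\end{aligned}
```

Here $\int_{-\pi/2}^{\pi/2}\cos^2\theta\,d\theta = \tfrac12\pi$, $\int_{-\pi/2}^{\pi/2}|\sin\theta|\cos\theta\,d\theta = 1$ and $\int_{-\pi/2}^{\pi/2}\cos\theta\,d\theta = 2$ have been used.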

3. In this section we consider Theorem VI and related theorems. We re- 
call that Theorem VI may be valid even when D=L. The proof is quite 
different from those of the previous section. 

The method of this section can best be presented by first using it to give 
an alternative proof of Theorem II. 

Alternative proof of Theorem II. There is no restriction in assuming that

(3.0) $h(\tfrac12\pi) = h(-\tfrac12\pi) = \pi L.$


* E. C. Titchmarsh, The Theory of Functions, Oxford, 1932, p. 130. 




Moreover it is clear that (1.5) follows if we prove that

(3.1) $\limsup_{n\to\infty} \dfrac{\log|\phi(z_n)|}{|z_n|} < 0$

implies that $h(0) \le 0$.
As in (2.6) 


(3.2) ss) = 


for any $\epsilon > 0$, is an entire function of exponential type. And as in (2.7)


(3.3) = 
is analytic for $|\mathrm{am}\, z| \le \tfrac12\pi$. As in (2.8)
(3.4) W(2) = Ofexp log |z|]), < 4x. 
Since $\phi(z)$ and $g(z)$ are of exponential type in $|\mathrm{am}\, z| \le \tfrac12\pi$, (2.1) and (3.3) give
(3.5) = 
for some B>0. From (3.3) we have 
Since D>L, (3.0) and (2.1) imply that 
(3.7) $(iy)/F(iy) = O(exp [— — L)| y|]). 
From (3.6) and (3.7) 
1 
(3.8) His) = 


Using (3.5) and (3.8), ze-?24)(z) is bounded along the imaginary axis and the 
line am z=}7. By Theorem A this and (3.4) implies that it is bounded in the 
entire right half-plane. Thus. 


282 1 
“tee 


When we use this, it follows at once from the Cauchy integral theorem that 
for x>0, 




1+2 Qrid_in +s 


1 io e72Bs 
= as f dy 
1 + Ss 0 


—1t00 


—2Bs 


Or if 
3.10 H = e“td 
then for x >0, 

3.11 H —uzdy, 
(3.11) (u)e~“*du 


When we use (3.9) and close the path of integration to the right in (3.10), 
it is clear that 


(3.12) $H(u) = 0, \quad u < 0.$
On the other hand, when we use (3.3) in (3.10), it follows that 


1 (it)e~2*Bt 


H(u ) iut 

F(it)(1 + it) 
(3.13) 

(Zn) | ico e (u—2B+e) 

- > ds 

1 F'(tn) 2wid (1 + — Zn) 
Or for u<2B—e, 
(3.14) H(u) _ (it)e 23% (4-28) 


J + it) + Sn) 


Clearly by (3.1), (2.4), and (1.2) the infinite series on the right of (3.14) represents an analytic function for $u < 2B$. Again by (3.7) it follows that the infinite integral on the right of (3.14) represents an analytic function in $u$ for $-\infty < u < \infty$. Since the sum of two analytic functions is analytic it follows that $H(u)$ is analytic for $u < 2B - \epsilon$. But by (3.12), $H(u) = 0$, $u < 0$. Therefore $H(u) = 0$, $u < 2B - \epsilon$. Using this in (3.11), we have


—2Br 
-f e~“*H(u)du. 




By (3.9) and (3.10), H(u) is bounded. Thus 
1+ x 
or $\psi(x) = O(e^{\epsilon x})$. If we recall that $\phi(x) = g(x) + F(x)\psi(x)$, it follows that
] 
log 
x 


= 


h(0) = lim sup 
Since $\epsilon$ is arbitrary, $h(0) \le 0$. This completes the proof.

The difference between this proof and those of §2 is that here we get a 
representation of $\psi(z)$ in terms of $H(u)$. In this section we are attempting to
refine Theorem II so that D>L, (1.7), is not necessary. Let us see how such 
a change would affect the argument in the preceding theorem. It is clear 
that the crucial point in this argument is the paragraph following (3.14) and 
it is only with this that we need concern ourselves here. 

It is convenient to write (3.14) as 


(3.15) $H(u) = H_1(u) - H_2(u), \quad u < 2B - \epsilon,$
where 
(3.16) Hi(u) = FGA + 
and 


zn(u—2B) 


It is clear that $H_2(u)$ is analytic for $u < 2B$ irrespective of how $D$ compares with $L$. Therefore it is only with $H_1(u)$ that we need be concerned in changing (1.7). If $D = L$, then there need exist no $\delta > 0$ such that

$\phi(it)/F(it) = O(e^{-\delta|t|})$

and $H_1(u)$ need no longer be analytic.

Is there any weaker condition than analyticity on $H_1(u)$ that tells us that if $H_1(u) = H_2(u)$, $u < 0$, and $H_2(u)$ is analytic for $u < a$, then $H_1(u) = H_2(u)$, $u < a$? That there are such weaker conditions is shown by the following result.


THEOREM B. If
(3.18) $f(u) = \int_{-\infty}^{\infty} G(t)e^{iut}\,dt,$

where $G(t) \in L(-\infty, \infty)$ and if for $t > 0$


| (3.17) H.(u) = 3 | 
(3.19) $G(t) = O(e^{-\theta(t)}),$

where $\theta(t)$ is a monotone non-decreasing function such that

(3.20) $\int_1^{\infty} \dfrac{\theta(t)}{t^2}\,dt = \infty,$

then if $f(u)$ coincides with an analytic function over some interval it coincides with the analytic function over its entire interval of analyticity on the $u$ axis.*
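To see what condition (3.20) permits, it may help to compare two sample weights. These examples are illustrations chosen here, not taken from the paper, and (3.20) is read as the divergence of $\int^{\infty}\theta(t)t^{-2}\,dt$; the lower limit is immaterial.

```latex
\begin{aligned}
\theta(t) = \frac{t}{\log t}\ (t \ge e):\qquad
  &\int_e^{T}\frac{\theta(t)}{t^2}\,dt = \int_e^{T}\frac{dt}{t\log t} = \log\log T \to \infty, \\
\theta(t) = \frac{t}{(\log t)^2}\ (t \ge e^2):\qquad
  &\int_{e^2}^{T}\frac{\theta(t)}{t^2}\,dt = \int_{e^2}^{T}\frac{dt}{t(\log t)^2}
   = \frac12 - \frac{1}{\log T} \to \frac12 .
\end{aligned}
```

Both functions are non-decreasing on the stated ranges. The first satisfies (3.20), so a decay $G(t) = O(e^{-t/\log t})$, which is slower than any exponential $e^{-\delta t}$, already suffices for Theorem B; the second does not satisfy (3.20).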


In order to use this theorem with $H_1(u)$ for $f(u)$ it must be shown that $\phi(it)/F(it)$ satisfies a condition of the type (3.19) under the hypothesis of Theorem VI. This is done by

LEMMA 2. If

(3.21) $F(s) = \prod_{1}^{\infty}\Bigl(1 - \dfrac{s^2}{\lambda_n^2}\Bigr)$

and if $\{\lambda_n\}$ satisfies the requirements of Theorem VI, then there exists a non-decreasing function $\theta(y)$, $y > 0$, such that for sufficiently large $|y|$


(3.22) etLlul /F(iy) = O(| y , 


and such that
(3.23) $\int_1^{\infty} \dfrac{\theta(y)}{y^2}\,dy = \infty.$
Proof. We assume that $\lambda_1 \ge 1$ (since we can discard any $\lambda_n$'s which are less than one). Clearly


log | F(iy)| = f (1 +2) 


2 
0 y+ 


y? 


* In On a class of non-vanishing functions, Proceedings of the London Mathematical Society, 
vol. 41 (1936), p. 393, it is shown that if f(u) vanishes over any interval it vanishes identically. Theo- 
rem B is closely related to this result. The proof of Theorem B will appear shortly in the Journal 
for Mathematics and Physics. 


we have 


Since 




y? 
du 
y? + u2 




log = 2 f 
u 
= du f (A(v) — Lv + du 
log u 
icf 2 + du 


Lo +0) + © — 2 log | 


IV 


dv — 2C log | y| + constant. t 
9 


If we set 


vA(v) Lo +C 
6(y) -f et dv, y>1, 
1 


v 


then by (1.11), $\theta(y)$ is non-decreasing. Also


A(v) —-Lv+C 
-f dv 
1 


y2 


This proves the lemma. 

Proof of Theorem VI. Here, as in the alternative proof of Theorem II, it suffices to show that $\limsup \log|\phi(\lambda_n)|/\lambda_n < 0$ implies $h(0) \le 0$. The proof proceeds almost exactly like that of the alternative proof of Theorem II up to (3.14) except that we concern ourselves with $e^{-2Bz}\psi(z)/(1 + z)^{2C+2}$ here. So in place of (3.14) we get


1 ico —2iBy Xn 
| 
2m J _ io F(iy)(1 + iy)2e+? F’(An)(1 + An) 
u<2B-—e. 


As before the infinite series on the right is analytic for $u < 2B$ and $H(u) = 0$, $u < 0$, or by (3.15), $H_1(u) = H_2(u)$, $u < 0$. By Lemma 2
1 io 


H 
m= F(iy)(1 + iy)2e+? 




satisfies the requirements of Theorem B. Since 


(An) ern (4-2B) 


u<0O, 


Theorem B implies that this is true for $u < 2B - \epsilon$. That is $H(u) = 0$ for $u < 2B - \epsilon$. As in the alternative proof of Theorem II, this leads at once to $\psi(x) = O(e^{\epsilon x})$. $h(0) \le 0$ follows at once, completing the proof.

There are variations of Theorem VI such as the following: 


THEOREM VI-A. Theorem VI remains true if (1.10) is replaced by
$\phi(iy) = O(\exp[\pi L|y| - \theta(|y|)]),$

where $\theta(y)$, $y > 0$, is a non-decreasing function of $y$ which satisfies (3.20).

The proof of this result is quite obvious from what precedes. 

Before proving Theorem VII the following result very much like Lemma 2 
is necessary. 

LEMMA 3. Let $F(z)$ be defined as in (2.0) with $\{z_n\}$ satisfying the requirements of Theorem VII. Then


1 
= , 
F(iy) 


where $\theta(y)$, $y > 0$, is a non-decreasing function of $y$ satisfying (3.20).
If $z_n = r_ne^{i\theta_n}$, there is no loss of generality in assuming that $|\theta_n| < \tfrac14\pi$. We then have, if $\Lambda(u)$ is the number of $|z_n| < u$,


2y? 4 
log | F(iy)| = = DX log (1 + 26 +2) 
1 A, x 
4 1 4 
re” tog (1 ++) -—f dA(u) log (1 + 2) 
2 1 ny 2 0 u* 
log 2 
2 


> 


1 
an(u) > — a(| 


If we take $\theta(|y|) = \Lambda(|y|)$, then $|F(iy)|^{-1} = O(\exp[-\tfrac12\theta(|y|)])$. To prove (3.20) we have, by (1.8),

$\int_1^{\infty} \dfrac{d\Lambda(y)}{y} = \infty$

and therefore (3.20) is satisfied.

Proof of Theorem VII. To show that (1.8) is sufficient we proceed exactly as in the alternative proof of Theorem II up to (3.14). We then use Lemma 3 to apply Theorem B to $H_1(u)$, (3.16), and show that $H(u) = 0$, $u < 2B - \epsilon$. From this the fact that (3.1) implies $h(0) \le 0$ follows at once.

We now turn to the necessity of condition (1.8). Let us assume that

$\sum_{1}^{\infty} \dfrac{1}{|z_n|} < \infty.$

Then we shall show that no such result as (1.5) holds. Let

$\phi(z) = \prod_{1}^{\infty} \dfrac{(z - \bar z_n)(z - z_n)}{(z + z_n)(z + \bar z_n)}.$

Then $\phi(z)$ is analytic in the right half-plane. Moreover

$|\phi(re^{i\theta})| = \prod_{1}^{\infty} \left|\dfrac{re^{i\theta} - z_n}{re^{i\theta} + \bar z_n}\right| \left|\dfrac{re^{i\theta} - \bar z_n}{re^{i\theta} + z_n}\right| < 1, \quad |\theta| < \tfrac12\pi.$

Therefore $\phi(z)$ satisfies the requirements of Theorem VII. Moreover

$\limsup_{n\to\infty} \dfrac{\log|\phi(z_n)|}{|z_n|} = -\infty.$

If (1.5) could be applied it would give

$\limsup_{x\to\infty} \dfrac{\log|\phi(x)|}{x} = -\infty.$

Applying Theorem A to $e^{Cz}\phi(z)$, $C > 0$, in the upper and lower right quadrants we see that it is bounded. Then again applying Theorem A to the bounded function $e^{Cz}\phi(z)$ in the right half-plane, we see that for $|\mathrm{am}\, z| \le \tfrac12\pi$, $|e^{Cz}\phi(z)| \le 1$. Since $C$ can be made arbitrarily large this means that $\phi(z) \equiv 0$, which obviously is not the case. Thus (1.8) is a necessary condition in order that Theorem VII be true.
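The bound $|\phi(z)| < 1$ for the product above can be illustrated numerically on a finite partial product. This is only a sketch: the zeros `zeros` are a hypothetical sequence chosen so that $\sum 1/|z_n|$ converges, and the factors are paired with conjugates so that each factor has modulus below one in the right half-plane.

```python
def blaschke_factor(z, zn):
    """One factor (z - zn)(z - conj(zn)) / ((z + conj(zn))(z + zn)) of the product."""
    zc = zn.conjugate()
    return (z - zn) * (z - zc) / ((z + zc) * (z + zn))

def partial_product(z, zeros):
    """Finite partial product over the given zeros."""
    out = 1 + 0j
    for zn in zeros:
        out *= blaschke_factor(z, zn)
    return out

# Hypothetical zeros in the right half-plane with |z_n| ~ n^2, so sum 1/|z_n| converges.
zeros = [complex(n * n, 0.5 * n) for n in range(1, 60)]

# Sample points with positive real part.
samples = [complex(x, y) for x in (0.1, 1.0, 7.5, 300.0)
           for y in (-50.0, -1.0, 0.0, 2.0, 80.0)]
moduli = [abs(partial_product(z, zeros)) for z in samples]
print(max(moduli))
```

Every sampled modulus stays below one, and the partial product vanishes at each $z_n$, which is the property that makes $\limsup \log|\phi(z_n)|/|z_n| = -\infty$ while $\phi$ is not identically zero.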

The analogue of Theorem III in this section is the following: 


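THEOREM VIII. Let $\Phi(z)$ be an analytic function in the right half-plane $|\mathrm{am}\, z| \le \tfrac12\pi$ such that for any $\epsilon > 0$

$\Phi(re^{i\theta}) = O(\exp[(a\log r\cos\theta + \epsilon\cos\theta + \pi b|\sin\theta|)r]), \quad |\theta| \le \tfrac12\pi,$

where $a \ge 0$, $b \ge -\tfrac12 a$, and let $\{\lambda_n\}$ be a positive increasing sequence satisfying (1.9). Let $\Lambda(u)$ be the number of $\lambda_n < u$. If

$\int_1^{\infty} \dfrac{\Lambda(u) - (b + \tfrac12 a)u}{u^2}\,du = \infty,$

if for some $C$, $\Lambda(u) > (b + \tfrac12 a)u - C$, and

(3.24) $\limsup_{n\to\infty} \dfrac{\log|\Phi(\lambda_n)|}{\lambda_n} \le \pi p,$
then
(3.25) $\Phi(re^{i\theta}) = O(\exp[\pi r(p\cos\theta + b|\sin\theta| + \epsilon\cos\theta)]), \quad |\theta| \le \tfrac12\pi,$

where $\epsilon$ is an arbitrary positive quantity.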

Proof. As in Theorem III we consider $\phi(z) = \Phi(z)/\Gamma(1 + az)$. By Theorem VI it follows that (2.18) holds. $g(z)$ is defined as in (2.19) and $\psi(z) = \{\phi(z) - g(z)\}/F(z)$.

In formulas analogous to (2.20), (2.21), and so on, we consider $\psi(z)/(1 + z)^C$ rather than just $\psi(z)$ as in Theorem III. Otherwise the proof now proceeds in precisely the same way as in Theorem III.


THEOREM IX. If in Theorem VIII (3.24) is replaced by
$\Phi(\lambda_n) = O(\exp[-k\lambda_n\log|\lambda_n|]), \quad k > 0,$
then (3.25) is replaced by
$\Phi(re^{i\theta}) = O(\exp[(-k\log r\cos\theta + \epsilon\cos\theta + \pi b|\sin\theta|)r]).$
Proof. This theorem is related to Theorem VIII in the same way as Theorem IV is related to Theorem III and its proof follows almost at once from that of Theorem VIII just as that of Theorem IV follows almost at once from Theorem III.


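THEOREM X. Let $\Phi(z)$ satisfy the requirements of Theorem VIII with (3.24) replaced by

(3.26) $\limsup_{n\to\infty} \dfrac{\log|\Phi(\lambda_n)| + 2b\lambda_n\log\lambda_n}{\lambda_n} = -\infty.$

Then $\Phi(z) \equiv 0$.
Proof. Applying Theorem IX to $e^{Bz}\Phi(z)$ for any $B > 0$, we have

(3.27) $|\Phi(z)| \le B_1\exp[(-2b\log r\cos\theta - B\cos\theta + \pi b|\sin\theta|)r], \quad |\theta| \le \tfrac12\pi,$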




where $B_1$ is a constant depending on $B$. Applying (2.27) to $\Phi(z)$ and using (3.27) we have for some $A$, depending only on $\Phi(z)$ (if $\Phi(z)$ does not vanish identically),

1 


2 
logt | (Re) | cos 6 


wR 


A =) (2xby)d + 


Again using (3.27) 


cos? 6 dé + b cos sin 6| 


2b log R+ B oft 


+ f cos 6 dé. 
mR 


2 
—Ai — 4B+ + — log Bi. 
wR 


Letting $R \to \infty$ we see that by choosing $B > 2A_1 + 2b$ we obtain a contradiction.


PRINCETON UNIVERSITY 
AND 
INSTITUTE FOR ADVANCED STUDY, 
PRINCETON, N. J. 




ON DIFFERENTIAL GEOMETRY IN THE LARGE, I 
(MINKOWSKI’S PROBLEM)* 


BY 
HANS LEWY 


Introduction. Hermann Minkowski,† in a fundamental paper on convex bodies, proposed the following Problem (M): to determine a convex, three-dimensional body $B$ whose surface admits of a given Gaussian curvature $K(n) > 0$, assigned as a continuous function of the direction of the interior normal to the surface. After having stated three obviously necessary conditions for the function $K(n)$, Minkowski proceeds to solve an analogous Problem (M') for polyhedra. He then considers a passage to the limit among the solutions of problems (M') approximating (M), and establishes their convergence to a convex body $B_0$. This construction leaves open the question as to whether $B_0$ is a solution of (M). Minkowski remarks that $B$, if it exists, is, to within a translation, the uniquely determined solution of a certain third Problem (M'') of the calculus of variations and that $B_0$, too, is a solution of (M'').

Now if we assume (H): the surface of $B_0$ is differentiable to a sufficiently high order, then $B_0$ solves (M). However, Minkowski does not discuss (H), but proves instead that the mixed volume $V(B_0, B_0, C)$ of $B_0$ with an arbitrary convex body $C$ may be computed as though $B_0$ were a solution of (M) and that, furthermore, a convex body is, except for a translation, uniquely determined by its mixed volume with the totality of convex bodies. While later authors have modified Minkowski's methods, there has been no improvement of his results as far as the hypothesis (H) is concerned.

Thus Minkowski’s results are open to the same criticism that could be 
raised against the early solutions of Plateau’s problem: namely, that instead 
of solving the proposed problem, a more general problem is treated whose 
solution coincides with that of the former only if the latter solution satisfies 
certain highly restrictive conditions, and no indication is presented that these 
conditions are actually satisfied. 

The present paper contains a solution of (M) for the case of analytic 
K(n). It does not involve the Brunn-Minkowski inequalities, nor, indeed, the 
idea of mixed volume. It uses instead the author’s results on elliptic and 

* Presented to the Society, November 28, 1936; received by the editors March 1, 1937.

† Minkowski, Werke, pp. 231-276, entitled Volumen und Oberfläche. See especially §10. Also Bonnesen-Fenchel, Theorie der konvexen Körper, Ergebnisse, vol. 3, no. 1 (1934), which contains a bibliography of related papers.



DIFFERENTIAL GEOMETRY IN THE LARGE 259 


analytic equations of the Monge-Ampére type and establishes the existence 
of an analytic B by a continuity method. 

In order to make this treatment of (M) complete, a new proof of the 
uniqueness of B is included. This proof is based on a modification of a beauti- 
ful idea of Cohn-Vossen* who, in a similar problem, reduced the uniqueness 
problem to the determination of a certain topological index of a vector field. 
Our modification is such as to allow an application of this idea to a wider 
class of related uniqueness problems in the large. 

1. Let $u(x, y)$ be a homogeneous polynomial solution of Laplace's equation of degree $n > 2$. A suitable rotation of the $(x, y)$-system transforms $u(x, y)$ into a constant multiple of $\Re[(x + iy)^n]$, and the asymptotic directions of the surface $u = u(x, y)$, determined by

(1) $u_{xx}dx^2 + 2u_{xy}dxdy + u_{yy}dy^2 = 0,$

undergo the same rotation as the coordinates. Assuming this rotation effected, we obtain for the asymptotic directions

$\Re[(x + iy)^{n-2}(d(x + iy))^2] = 0,$

or, in polar coordinates $r$, $\theta$,

$\Re[e^{i(n-2)\theta}(dx + i\,dy)^2] = 0.$

Thus the vector $dx + i\,dy$ of the asymptotic direction forms the angle $-(n-2)\theta/2$ or $-(n-2)\theta/2 + \pi/2$ with the $x$-axis; the two asymptotic directions are perpendicular, and either direction turns through an angle $-(n-2)\pi$ as we follow it along a Jordan curve containing the origin in its interior. In other words: the asymptotic directions form two distinct fields of directions with the origin as a singular point of index $-(n-2)/2$. We notice furthermore that the discriminant of (1) is a constant multiple of $r^{2n-4}$ and vanishes only for $r = 0$.
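Both statements of this section, the index $-(n-2)/2$ and the discriminant proportional to $r^{2n-4}$, can be checked numerically for $u = \Re[(x+iy)^n]$. The sketch below is an illustration under that normalization; it uses the identity $\frac{d^2}{dz^2}z^n = n(n-1)z^{n-2}$ and, since $u$ is harmonic, tracks the doubled direction angle $2\varphi$ defined by $u_{xx}\cos 2\varphi + u_{xy}\sin 2\varphi = 0$ around a circle. The function names are introduced here for illustration only.

```python
import math

def second_derivatives(n, x, y):
    """(u_xx, u_xy, u_yy) for u = Re[(x+iy)^n], via d^2/dz^2 z^n = n(n-1) z^(n-2)."""
    w = n * (n - 1) * complex(x, y) ** (n - 2)
    return w.real, -w.imag, -w.real

def discriminant(n, x, y):
    """u_xx * u_yy - u_xy^2, the discriminant of the form (1)."""
    uxx, uxy, uyy = second_derivatives(n, x, y)
    return uxx * uyy - uxy * uxy

def asymptotic_index(n, radius=1.0, steps=4000):
    """Rotation number of one asymptotic direction along a circle about the origin."""
    total, prev = 0.0, None
    for k in range(steps + 1):
        t = 2.0 * math.pi * k / steps
        uxx, uxy, _ = second_derivatives(n, radius * math.cos(t), radius * math.sin(t))
        two_phi = math.atan2(-uxx, uxy)   # solves uxx*cos(2phi) + uxy*sin(2phi) = 0
        if prev is not None:
            d = two_phi - prev
            total += (d + math.pi) % (2.0 * math.pi) - math.pi   # unwrap the jump
        prev = two_phi
    return total / (4.0 * math.pi)        # change of phi divided by 2*pi

for n in (3, 4, 6):
    print(n, asymptotic_index(n), discriminant(n, 0.3, -1.1))
```

For $n = 3, 4, 6$ the computed index is $-\tfrac12, -1, -2$, and the discriminant equals $-(n(n-1))^2 r^{2n-4}$, which is negative away from the origin.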
2. We prove now the following lemma: 


LEMMA. Let $F(x, y, u, p, q, r, s, t)$ be analytic in the neighborhood of $(x_0, y_0, u_0, p_0, q_0, r_0, s_0, t_0)$ and $4(\partial F/\partial r)(\partial F/\partial t) - (\partial F/\partial s)^2 > 0$. Let $u(x, y)$ and its first derivatives $p$, $q$ and second derivatives $r$, $s$, $t$ be a solution of $F = 0$, analytic in a neighborhood of $(x_0, y_0)$ and such that $u(x_0, y_0) = u_0$, $p(x_0, y_0) = p_0$, $\cdots$, $t(x_0, y_0) = t_0$. Assume $u'(x, y)$ and its derivatives $p'$, $q'$, $r'$, $s'$, $t'$ to be a second analytic solution of $F = 0$, coinciding together with its derivatives $p'$, $q'$, $r'$, $s'$, $t'$, with $u$ and its derivatives at the point $(x_0, y_0)$. Then the difference $U = u - u'$ represents a surface $U(x, y)$ whose Gaussian curvature is negative in a sufficiently small

* Cohn-Vossen, Zwei Sdtze siber die Starrheit der Eiflichen, Gottinger Nachrichten, 1927, pp. 
125-134. 




260 HANS LEWY {March 


neighborhood of $(x_0, y_0)$, with the exception of $(x_0, y_0)$ itself, and the index of either of its distinct asymptotic directions is negative at $(x_0, y_0)$, unless $u(x, y)$ is identically equal to $u'(x, y)$.


In terms of the second derivatives $R$, $S$, $T$ of $U(x, y)$ the asymptotic directions of $U = U(x, y)$ are given by

(2) $Q = R\,dx^2 + 2S\,dxdy + T\,dy^2 = 0;$

and (2) admits of real solutions $dx$, $dy$ if $RT - S^2 < 0$. To show that $RT - S^2 < 0$ express $F(x, y, u(x, y), p(x, y), \cdots) - F(x, y, u'(x, y), p'(x, y), \cdots)$ as a power series in $(x - x_0, y - y_0)$ and observe that the terms of lowest degree are given by

$E = a\bar R + b\bar S + c\bar T,$

where $\bar R$, $\bar S$, $\bar T$ are the second derivatives of the non-vanishing terms $\bar U$ of lowest degree $n > 2$ in the development of $U$, and $a$, $b$, $c$ are the values of $\partial F/\partial r$, $\partial F/\partial s$, $\partial F/\partial t$ at $(x_0, y_0)$, for which by hypothesis $4ac - b^2 > 0$. Since $F(\cdots, u(x, y), \cdots) - F(\cdots, u'(x, y), \cdots) = 0$, we have $E = 0$, and this is possible only for $\bar R\bar T - \bar S^2 < 0$ or $\bar R = \bar S = \bar T = 0$. Now $\bar R$, $\bar S$, $\bar T$ vanish simultaneously only at $(x_0, y_0)$. For there exists a suitable linear transformation of the $(x, y)$-plane with determinant 1 which leaves $(x_0, y_0)$ invariant and transforms $E = 0$ into Laplace's equation. $\bar U$ is thereby transformed into a harmonic homogeneous polynomial of degree $> 2$ in $(x - x_0, y - y_0)$, and for such polynomials we proved in §1 that the discriminant of the second derivatives vanishes only at $(0, 0)$. Hence the development of $RT - S^2$ starts with the negative term $\bar R\bar T - \bar S^2$ and we conclude that $RT - S^2$ is negative for sufficiently small $|x - x_0|$, $|y - y_0|$ and vanishes only at $x = x_0$, $y = y_0$.
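The step from $E = 0$ to a negative discriminant is a short completion of the argument. The lines below are a sketch of it, with $a$, $b$, $c$ denoting the values of $\partial F/\partial r$, $\partial F/\partial s$, $\partial F/\partial t$ and $4ac - b^2 > 0$ as in the hypothesis of the lemma; $a > 0$ may be assumed after replacing $E$ by $-E$ if necessary.

```latex
\begin{aligned}
a\bar R + b\bar S + c\bar T = 0
  \;&\Longrightarrow\; a\bar R\,\bar T = -(b\bar S + c\bar T)\,\bar T, \\
a(\bar R\bar T - \bar S^2) &= -\bigl(a\bar S^2 + b\bar S\bar T + c\bar T^2\bigr) \le 0 .
\end{aligned}
```

The quadratic form $a\bar S^2 + b\bar S\bar T + c\bar T^2$ is positive definite because $a > 0$ and $b^2 - 4ac < 0$, so equality holds only for $\bar S = \bar T = 0$, and then $\bar R = 0$ as well.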

Since the linear transformation does not change the index of a field of directions, we conclude from §1 that the index of the field

(3) $\bar Q = \bar R\,dx^2 + 2\bar S\,dxdy + \bar T\,dy^2 = 0$

is negative. Since, for sufficiently small values of $|x - x_0|$, $|y - y_0|$, the directions of the field (2) differ arbitrarily little from those of (3), the index of (2) is negative at $(x_0, y_0)$. This completes the proof of the Lemma.

3. We may now prove the following: 

THEOREM 1. Two closed convex analytic surfaces S and S’ are congruent if 


they possess the same positive Gaussian curvature K at points for which their 
inner normals are parallel and similarly directed. 


By parallel normals we map $S$ and $S'$ on the unit sphere $\sigma$. An arbitrary equator divides $\sigma$ and, since the map is one-to-one, $S$ and $S'$, into two regions




in each of which the sphere as well as $S$ and $S'$ assume the form $Z = Z(X, Y)$ in suitably chosen rectangular coordinates. Upon introducing

(4) $x = \dfrac{\partial Z}{\partial X}, \quad y = \dfrac{\partial Z}{\partial Y}$

as independent variables instead of $(X, Y)$ and setting

(5) $H(x, y) = -Z(x, y) + xX(x, y) + yY(x, y),$

we obtain

(6) $X = H_x, \quad Y = H_y.$
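Relation (6) is the usual property of a Legendre transformation. As a sketch of the omitted computation, with (4) read as $x = \partial Z/\partial X$, $y = \partial Z/\partial Y$:

```latex
\begin{aligned}
dZ &= x\,dX + y\,dY, \\
dH &= -dZ + x\,dX + X\,dx + y\,dY + Y\,dy = X\,dx + Y\,dy, \\
\text{hence}\quad H_x &= X,\qquad H_y = Y,\qquad Z = -H + xH_x + yH_y .
\end{aligned}
```

The surface is thus recovered from $H$ alone, which is why the congruence of $S$ and $S'$ can be expressed through $H$ and $H'$.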

Now $S$ and $S'$ satisfy the condition that their curvatures for corresponding parallel normals, i.e., for the same value of $(x, y)$, are the same positive function $K(x, y)$, whence $S$ and $S'$ are solutions of

$(1 + x^2 + y^2)^2K(x, y) = \dfrac{\partial^2 Z}{\partial X^2}\dfrac{\partial^2 Z}{\partial Y^2} - \Bigl(\dfrac{\partial^2 Z}{\partial X\,\partial Y}\Bigr)^2 = \dfrac{\partial(x, y)}{\partial(X, Y)}$

or, finally, of
(7) $H_{xx}H_{yy} - H_{xy}^2 = K^{-1}(x, y)(1 + x^2 + y^2)^{-2}.$
It is readily shown that the second fundamental form of the surface (6) is
(8) $(H_{xx}dx^2 + 2H_{xy}dxdy + H_{yy}dy^2)(1 + x^2 + y^2)^{-1/2}.$
Suppose that the formulas (4) and (5) lead to a function $H(x, y)$ if applied to $S$, and to $H'(x, y)$ if applied to $S'$. Then the statement of Theorem 1 is equivalent to the equation

$H'(x, y) = H(x, y) + l(x, y),$

where $l(x, y)$ is a linear function of $(x, y)$.

Consider, with Cohn-Vossen, the congruence points of $S$ and $S'$, i.e., those points for which their normals and their second fundamental forms coincide. If all of their points were congruence points, we should have

$H_{xx} = H'_{xx}, \quad H_{xy} = H'_{xy}, \quad H_{yy} = H'_{yy},$

and the theorem is proved. In the alternative case we shall show the existence of at least one congruence point. The two second differential forms of $S$ and $S'$ may both be assumed to be positive definite. The equation

(9) $(H_{xx} - H'_{xx})dx^2 + 2(H_{xy} - H'_{xy})dxdy + (H_{yy} - H'_{yy})dy^2 = 0$



obtained by setting the two forms equal to one another, determines two distinct directions tangential to $\sigma$ at each point of $\sigma$ which is not a congruence point. For the ellipses of the $(dx, dy)$-plane

$H_{xx}dx^2 + 2H_{xy}dxdy + H_{yy}dy^2 = 1$
and
$H'_{xx}dx^2 + 2H'_{xy}dxdy + H'_{yy}dy^2 = 1$

have, by (7), the same area $\pi(1 + x^2 + y^2)K^{1/2}(x, y)$. As they are concentric but not identical, they intersect in four distinct points; the ratios of their coordinates are the two distinct solutions $dx : dy$ of (9).
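The area statement rests on the fact that the ellipse $A\,dx^2 + 2B\,dxdy + C\,dy^2 = 1$ encloses the area $\pi/\sqrt{AC - B^2}$, so that two forms with the same determinant give ellipses of equal area. The following sketch checks this by numerical integration in polar coordinates; the coefficients are arbitrary illustrative values, not data from the paper.

```python
import math

def ellipse_area(A, B, C, steps=20000):
    """Area enclosed by A*dx^2 + 2*B*dx*dy + C*dy^2 = 1.

    Uses area = (1/2) * integral of r(t)^2 dt over [0, 2*pi], where
    r(t)^2 = 1 / (A cos^2 t + 2 B cos t sin t + C sin^2 t), with the midpoint rule.
    """
    h = 2.0 * math.pi / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        c, s = math.cos(t), math.sin(t)
        total += 0.5 * h / (A * c * c + 2.0 * B * c * s + C * s * s)
    return total

# Two positive definite forms with the same determinant A*C - B^2 = 1.75 (illustrative).
area_first = ellipse_area(2.0, 0.5, 1.0)
area_second = ellipse_area(1.0, -0.5, 2.0)
print(area_first, area_second, math.pi / math.sqrt(1.75))
```

With $AC - B^2 = K^{-1}(1 + x^2 + y^2)^{-2}$ from (7), the closed form gives the area $\pi(1 + x^2 + y^2)K^{1/2}$ quoted in the text.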

It is impossible to construct a field of tangential directions on the sphere without singularities, and the sum of the indices of these singularities equals 2 if there are only finitely many singularities. Now choose, at an arbitrary point of $\sigma$, one of the two directions (9) and extend, by continuity, this choice over the whole of $\sigma$. If there were no congruence points, we should obtain a field of tangential directions on $\sigma$ without singularities. Hence there is at least one congruence point $(x_0, y_0)$.

Subtracting if necessary a linear function from $H'(x, y)$ we may assume that, at $(x_0, y_0)$, $H(x, y)$ and $H'(x, y)$ coincide with their derivatives up to the second order, without affecting the truth of (7) nor that of (9). But now our lemma implies that unless $H$ and $H'$ are identical, the congruence point $(x_0, y_0)$ is isolated and has a negative index. Summing over all indices of all singularities we still obtain a negative number in contradiction to the general fact mentioned above. Hence $S$ and $S'$ are congruent.


THEOREM 1'. Consider a sequence of closed convex analytic surfaces $S(\tau)$ of positive curvature such that their corresponding functions $H(x, y, \tau)$ depend analytically on $(x, y, \tau)$ for small values of $\tau$. Suppose that all surfaces $S(\tau)$ have a common point of contact corresponding to the same point $(x, y)$ of $\sigma$. Assume that for each point of $S(0)$ the derivative $(\partial K/\partial\tau)_{\tau=0}$ vanishes. Then we have also $(\partial H/\partial\tau)_{\tau=0} = 0$.

Abbreviate the operator $(\partial/\partial\tau)_{\tau=0}$ by the use of the symbol $\delta$ and consider the field of tangential directions $dx$, $dy$ on $\sigma$, determined by

(9') $\delta H_{xx}dx^2 + 2\delta H_{xy}dxdy + \delta H_{yy}dy^2 = 0.$

Unless $\delta H_{xx} = \delta H_{xy} = \delta H_{yy} = 0$, there are, at each point $(x, y)$, two real and distinct directions satisfying (9'). For we deduce from (7) by differentiation

(7') $H_{xx}\delta H_{yy} + H_{yy}\delta H_{xx} - 2H_{xy}\delta H_{xy} = 0,$
which yields $\delta H_{xx}\delta H_{yy} - (\delta H_{xy})^2 < 0$ in view of $H_{xx}H_{yy} - H_{xy}^2 > 0$.
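The differentiation leading to (7') and the sign of the resulting discriminant can be written out as follows. This is a sketch under the reading of (7) as $H_{xx}H_{yy} - H_{xy}^2 = K^{-1}(1 + x^2 + y^2)^{-2}$, with $\delta K = 0$ by hypothesis.

```latex
\begin{aligned}
\delta\bigl(H_{xx}H_{yy} - H_{xy}^2\bigr)
  &= H_{xx}\,\delta H_{yy} + H_{yy}\,\delta H_{xx} - 2H_{xy}\,\delta H_{xy}
   = \delta(K^{-1})\,(1 + x^2 + y^2)^{-2} = 0, \\
H_{yy}\bigl(\delta H_{xx}\,\delta H_{yy} - (\delta H_{xy})^2\bigr)
  &= -\bigl(H_{xx}(\delta H_{yy})^2 - 2H_{xy}\,\delta H_{xy}\,\delta H_{yy} + H_{yy}(\delta H_{xy})^2\bigr) \le 0 .
\end{aligned}
```

The quadratic form on the right is positive definite because $H_{xx}H_{yy} - H_{xy}^2 > 0$ and the second form is taken positive definite, so equality occurs only when all three of $\delta H_{xx}$, $\delta H_{xy}$, $\delta H_{yy}$ vanish.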




As in the preceding proof we conclude first the existence of a point $P$ on $\sigma$ for which $\delta H_{xx} = \delta H_{xy} = \delta H_{yy} = 0$. Assume this not to be true identically.

Since an addition of a linear function of (x, y) to 6H does not affect the 
truth of (7’) and (9’), we may assume that the field of directions determined 
by (9’) in the neighborhood of P corresponds to a function 6H that vanishes 
at P together with its derivatives up to the second order. This function 6H 
may be considered as the difference between the following two solutions of 
(7’): 6H itself and the identically vanishing solution. From the lemma we see 
that P is an isolated singularity of the field defined by (9’) and that the index 
of P is negative. The proof of Theorem 1’ can now easily be completed by 
recalling the conclusion at the end of the proof of Theorem 1. 

4. We state now the following: 


DEFINITION. Let $d\omega$ be the surface element of the unit sphere $\sigma$ and let $\xi$, $\eta$, $\zeta$ be the cosines of the angle of its normal with the axes of the $(X, Y, Z)$-system. A function $F$ of a point of $\sigma$ is called admissible if it depends analytically on the point of $\sigma$ and if

(10) $\iint F\xi\,d\omega = \iint F\eta\,d\omega = \iint F\zeta\,d\omega = 0.$


THEOREM 2. For every admissible positive function K on o there exists a 


closed convex analytic surface S whose curvature, considered as function of the 
interior normal vector (of length 1), equals K. 


First we shall demonstrate the following statement: 


II. Assume that for small values of a parameter $\tau$ an admissible positive function $K$ depends analytically on the point of $\sigma$ and $\tau$, and that for $\tau = 0$ there exists a surface $S(0)$ with $K(0)$ as corresponding curvature function. Then there exists an analytic closed convex surface $S(\tau)$ with $K(\tau)$ as corresponding curvature provided $|\tau|$ is sufficiently small.

Let $M$ be an arbitrary but analytic function on the unit sphere $\sigma$. Continue its definition from the sphere into the three-dimensional space containing it by assuming $M$ to be zero at the center and linear on every ray issuing from the center. $M$ thereby becomes a homogeneous function of degree 1 in any system of rectangular coordinates $(\xi, \eta, \zeta)$ with the origin as center and $M$ is analytic everywhere except at the origin. We have from the homogeneity

(11) $M = M_\xi\,\xi + M_\eta\,\eta + M_\zeta\,\zeta.$
Introduce
(12) $x = -\xi/\zeta, \quad y = -\eta/\zeta, \quad \zeta H(x, y) = -M.$

We obtain
(13) $H_x = M_\xi, \quad H_y = M_\eta.$
On $\sigma$ the coordinates $(x, y)$ may be used for either hemisphere $\zeta < 0$ or $\zeta > 0$. From (13) we conclude that $H_x$, $H_y$ are analytic on the whole of $\sigma$ if we define them on the equator $\zeta = 0$ by continuity.
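Relation (11) is Euler's identity for a function homogeneous of degree 1. As a numerical sanity check, the sketch below evaluates both sides by central differences for one sample function; the particular `M` is a hypothetical example chosen here and has no connection with the function constructed in the paper.

```python
import math

def M(xi, eta, zeta):
    """Hypothetical sample function, homogeneous of degree 1, smooth away from the origin."""
    rho = math.sqrt(xi * xi + eta * eta + zeta * zeta)
    return 2.0 * rho + (xi * eta - 3.0 * zeta * zeta) / rho

def gradient(f, p, h=1e-6):
    """Central-difference approximation to the gradient of f at the point p."""
    g = []
    for i in range(3):
        up, dn = list(p), list(p)
        up[i] += h
        dn[i] -= h
        g.append((f(*up) - f(*dn)) / (2.0 * h))
    return g

p = (0.4, -1.3, 0.8)                      # arbitrary point away from the origin
g = gradient(M, p)
euler_sum = sum(gi * pi for gi, pi in zip(g, p))
print(euler_sum, M(*p))
```

The two printed values agree to the accuracy of the finite differences, as (11) requires.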


We find

(14) $\zeta = \pm(1 + x^2 + y^2)^{-1/2}, \quad d\omega = \dfrac{dx\,dy}{(1 + x^2 + y^2)^{3/2}}.$

Whenever $M$ is such that $\partial(H_x, H_y)/\partial(x, y) \ne 0$, the reasoning of p. 260 shows that

(15) $\phi(H, H) = (1 + x^2 + y^2)^2\,\dfrac{\partial(H_x, H_y)}{\partial(x, y)}$

is the reciprocal Gaussian curvature of the surface

(16) $X = M_\xi, \quad Y = M_\eta, \quad Z = M_\zeta,$

whence we conclude that $\phi$ remains invariant under the change which $x$, $y$, and $H$ undergo as $\xi$, $\eta$, $\zeta$ are subjected to an orthogonal transformation that is sufficiently near the identity. This fact is obviously independent of the condition $\partial(H_x, H_y)/\partial(x, y) \ne 0$.

The integral over an arbitrary region 


can be transformed into the integral over its boundary 


1 
f (H.dH, + HydH,). 


As the expressions $H_x$, $H_y$ are analytic on the unit sphere we can extend the integration in $J$ over the whole of $\sigma$ and obtain $J = 0$ since the boundary integrals over the equator, generated by the integration over the upper and lower hemispheres, annul each other.

Using the invariance of $\phi(H, H)$, we obtain similarly

(17) $\iint \phi(H, H)\xi\,d\omega = \iint \phi(H, H)\eta\,d\omega = \iint \phi(H, H)\zeta\,d\omega = 0.$


We also find for the associated bilinear form 




¢(Ai, H2) = (1 + + + 
that 


$\iint \phi(H_1, H_2)\xi\,d\omega = \iint \phi(H_1, H_2)\eta\,d\omega = \iint \phi(H_1, H_2)\zeta\,d\omega = 0.$

In other words, $\phi(H_1, H_2)$ is admissible if $M_1$ and $M_2$ are analytic on the sphere.

After these preliminary remarks we return to the given function $K^{-1}(\tau)$ and develop it into a power series in $\tau$,
(19) $K^{-1}(\tau) = \kappa(\tau) = \kappa_0 + \kappa_1\tau + \kappa_2\tau^2 + \cdots.$
We try to determine a function $M(\xi, \eta, \zeta; \tau)$, homogeneous of first degree in $\xi$, $\eta$, $\zeta$,
(20) $M(\tau) = M_0 + M_1\tau + M_2\tau^2 + \cdots,$
depending analytically on $\xi$, $\eta$, $\zeta$, $\tau$ (excepting the point $\xi = \eta = \zeta = 0$) and being in the following relation to $K^{-1}(\tau)$: If we introduce by (12) the quantities $x$, $y$, $H(x, y; \tau)$, then (7) holds for all sufficiently small values of $|\tau|$. Let us develop the equation (7)

$\phi(H, H) = \kappa$

into a power series in $\tau$ and set the coefficients of both members equal. Denoting again by the operator $\delta$ the differentiation with respect to $\tau$ at $\tau = 0$, we find the equations:

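(21) $\delta^nM = \delta^nH(x, y)/(1 + x^2 + y^2)^{1/2}, \quad M_0 = H(x, y)/(1 + x^2 + y^2)^{1/2},$

(22) $H_{xx}\delta H_{yy} + H_{yy}\delta H_{xx} - 2H_{xy}\delta H_{xy} = \kappa_1(x, y)(1 + x^2 + y^2)^{-2},$
$H_{xx}\delta^2H_{yy} + H_{yy}\delta^2H_{xx} - 2H_{xy}\delta^2H_{xy} = 2!\,\kappa_2(x, y)(1 + x^2 + y^2)^{-2} - 2(\delta H_{xx}\delta H_{yy} - \delta H_{xy}\delta H_{xy}),$
$\cdots$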


Since by hypothesis $K^{-1}(\tau)$ is admissible for all $\tau$ in question, the same is true for the coefficients $\kappa_0, \kappa_1, \kappa_2, \cdots$.

Suppose that we have found $\delta^\nu M(\xi, \eta, \zeta)$ for $\nu < n$, in accordance with (22), and analytic on the sphere. Then the preliminary remarks show that the right-hand member in the $n$th equation yields zero if we multiply it in turn by

$(1 + x^2 + y^2)^2\xi\,d\omega, \quad (1 + x^2 + y^2)^2\eta\,d\omega, \quad (1 + x^2 + y^2)^2\zeta\,d\omega$


and integrate over the whole sphere. We therefore are entitled to apply a theorem of Hilbert* in which he states the now well known alternative for elliptic differential equations on the sphere for the special case of the equation with admissible right-hand member $r$,

$(1 + x^2 + y^2)^2[H_{xx}(\delta H)_{yy} + H_{yy}(\delta H)_{xx} - 2H_{xy}(\delta H)_{xy}] = r, \quad \delta H\cdot(1 + x^2 + y^2)^{-1/2} = \delta M,$

and establishes the existence of a solution $\delta M$, which is differentiable infinitely many times as the coefficients of the differential equation are. Applying Hilbert's result to the $n$th equation (22) we find the function $\delta^nM$. Since this equation is elliptic because $S(0)$ is convex and accordingly $H_{xx}H_{yy} - H_{xy}^2 > 0$, we conclude from the analyticity of the equation the analyticity of $\delta^nM$ on the sphere.
It remains to be seen that for sufficiently small $|\tau|$ the series

$M = M_0 + \sum_{n=1}^{\infty} \frac{\delta^n M}{n!}\,\tau^n$

converges. This can be true only if we eliminate the arbitrariness that affects the determination of $\delta^n M$, since we may add to $\delta^n M$ an arbitrary linear combination of $\xi, \eta, \zeta$ with constant coefficients and still retain a solution of the differential equation for $\delta^n M$. Denote by $L[v]$ the linear elliptic differential expression in $v$ on the sphere which in coordinates $(x, y)$ reduces to


L(v) = (1 + 2? + + — zy), v(1 + x? + 


Since $L[v]=0$ has precisely the three linearly independent solutions $v=\xi, \eta, \zeta$, $L[v]$ admits of a Green's function of the second kind $G(A; B)$, where $A$ and $B$ are two points of the sphere. With the aid of one such $G(A; B)$ we solve the equations (22), written in the abbreviated form

$L[\delta^n M] = f^n \qquad (n = 1, 2, \cdots),$

by setting

(23) $\delta^n M(A) = \iint G(A; B)\,f^n(B)\,d\omega_B,$

and thereby disposing of the arbitrariness in the determination of $\delta^n M$.

* D. Hilbert, Grundzüge einer allgemeinen Theorie der linearen Integralgleichungen, Leipzig and Berlin, 1912, p. 250.

We shall say “a function $f$ on the sphere satisfies a Hölder condition of exponent $\alpha$ and coefficient $C$” if for arbitrary distinct points $Q_1$ and $Q_2$ of spherical distance $\overline{Q_1Q_2}>0$ we have

$|f(Q_1) - f(Q_2)| \le C\,\overline{Q_1Q_2}^{\,\alpha},$

where $\alpha$ is a positive constant less than 1. Similarly “the derivatives of $f$ satisfy a Hölder condition of exponent $\alpha$ and coefficient $C$” if $0<\alpha<1$ and for arbitrary


| f(Qx) — fQ2)| COOF, falQr) — fa(Qx)| 


where f, is the derivative in the direction of the great circle joining Q; to Qe 
and f, is the derivative in the direction normal to this circle. Similarly for 
higher derivatives. 

With these notations the familiar estimates of the theory of linear elliptic differential equations are applied to the integral $v(A) = \iint G(A; B)f(B)\,d\omega_B$ and lead to the following:


LEMMA. If $f$ is bounded by $C$ and satisfies a Hölder condition of exponent $\alpha$ and coefficient $C$, then $v(A) = \iint G(A; B)f(B)\,d\omega_B$ and its first and second derivatives are bounded by $hC$ and satisfy a Hölder condition of exponent $\alpha$ and coefficient $Ch$, where $h$ does not depend on $f$.


From the convergence of the series (19), with the use of the Heine-Borel 
theorem, we derive the existence of a number j>0 such that 


1 1 
Ky —) 
< < 

p” 


where x,’ stands for the directional derivative of x, at an arbitrary point 
in an arbitrary direction. Hence there also exists a number p>0O such that 

k,| <1/p” and x, satisfies a Hélder condition of a certain exponent a and co- 
efficient 1/p’. 

The first equation (22) shows that 6M and its first and second derivatives 
are bounded by h/p and satisfy a Hélder condition of exponent a and coeffi- 
cient h/p. 

Consider the equation 


(24) = 


It admits of two roots for 2(7) and in particular the root 


Car® 


1 


n!p” 


| 
1 
64h? 




We have the following system of recurrent formulas from (24): 


22(0)2(0) = — 
p 


22(0)2”(0) = — — 22'2(0) 
p 


3! 4h-3! 
22(0)2’"(0) = — 62’(0)2’’(0) + 24herce, 
p 


3 


On the other hand consider the successive bounds and Hélder coefficients C, 
for 6°M and its first and second derivatives as obtained from (22). We have 
evidently the same law by which to form the successive inequalities* 

4h 


Gas—»> 
p 


2! 
4h-— + 8hC?, 
p? 


3! 
p 


As all terms involved are positive we conclude C,<c,, (v=1, 2, - - - ). Hence 
the series 


6"M 
T™ 
n! 


= M00) 


converges together with its first and second derivatives with respect to &, n, ¢ 
uniformly for sufficiently small |7| since 2(7) is a majorant series. In order to 
complete the proof of statement II we have to show that M(r) depends 
analytically on &, 7, ¢ for all sufficiently small |r| and (€, 7, ¢) (0, 0, 0). 

Denoting by M,(r) the mth partial sum of M(r), by x,(7) that of x(7), 
and by H,(x, y; 7) the function M,(r)(1+?+~?)"/?, we have for arbitrary 
e>0 and uniformly for all points of the sphere 


| = | any Hizy)(1 + + y?)? kn(r) | <€, 


provided x is large enough. For A, is a polynomial whose term of lowest de- 
gree in 7 is of degree m; if we replace everywhere in A, the 6M, 6M, - - - and 


* Observe the invariance of $(5H, 5H). 


4h 
or qa=—-») 
p 
4h-2! 
or =—+ 8hkcr, 
p? 


their first and second derivatives with respect to (x, y) by their upper bounds 
C, and form the sum of the absolute values of all terms thus obtained, we ob- 
tain less than the mth remainder in the similarly formed majorant of 


(H — Hzy(r))(1 + + y*)*, where H(r) = M(r)-(1+ + 
and this majorant converges since it is a polynomial of convergent power se- 
ries in 7. 

Thus $\lim_{n\to\infty} H_n(x, y; \tau) = H(x, y; \tau)$ is a solution of the analytic elliptic equation

$(H_{xx}(\tau)H_{yy}(\tau) - H_{xy}^2(\tau))(1 + x^2 + y^2)^2 = K^{-1}(\tau)$

and its second derivatives satisfy a Hölder condition; hence, by a well known theorem,* $H(x, y; \tau)$ is analytic in $x$ and $y$ for every closed bounded region of the $(x, y)$-plane. This result may, of course, be formulated invariantly by stating that $M(\tau)$ depends analytically on the point of the sphere for small values of $|\tau|$.

It will be observed that the proof of statement II which is thereby completed follows precisely the routine way of solving a functional equation in the neighborhood of a value $\tau_0$ of a parameter $\tau$ entering the equation, if it can be solved for $\tau=\tau_0$. Our theorem on Monge-Ampère equations† is, however, the essential tool which makes it possible to derive our present Theorem 2 from the statement II.

Returning to the hypotheses of Theorem 2 we form, with the given positive admissible distribution of reciprocal curvature on $\sigma$, the family of positive admissible distributions

$K^{-1}(\tau) = 1 - \tau + \tau K^{-1},$

which for $\tau=0$ reduces to the reciprocal curvature of $\sigma$ itself and for $\tau=1$ to that of the surface to be determined. Let $\tau'$ be the greatest value of $\tau$, $0 < \tau' \le 1$, such that for every positive $\epsilon$ there exists an analytic surface $S(\tau)$ of curvature $K(\tau)$ with $\tau'-\epsilon<\tau<\tau'$. We shall show that $\tau'=1$. First of all, by II, $\tau'>0$. Since for all values of $\tau$ in $0 \le \tau \le 1$ the curvature $K(\tau)$ is bounded from below by a fixed positive number, a theorem of Bonnet‡ shows that all existing $S(\tau)$ have a diameter which is bounded from above. Now take an arbitrary normal of $\sigma$ and introduce coordinates $(\xi, \eta, \zeta)$ such that its intersection with $\sigma$ becomes $(0, 0, 1)$. Our introduction of the $(x, y)$-system will give this point the coordinates $(0, 0)$. Let $H(x, y; \tau)(1+x^2+y^2)^{-1/2}$ be the distance of the tangent plane of $S(\tau)$ from a fixed point for which we take the center of gravity of $S(\tau)$. Then we have $|H(x, y; \tau)| < 2B$ where $B$ is the upper bound of the diameter of $S(\tau)$ and $(x, y)$ is restricted to the circle $x^2+y^2<1$. Apply our theorem on Monge-Ampère equations to a sequence of solutions $H(x, y; \tau)$ of (7) for which the parameter $\tau$ tends to $\tau'$. We obtain a subsequence converging to an analytic solution $H(x, y; \tau')$ of (7) with $\tau=\tau'$. Since the origin of the $(x, y)$-system corresponds to an arbitrary normal of $\sigma$, the Heine-Borel Lemma shows the existence of a closed analytic surface $S(\tau')$ of curvature $K(\tau')$. Now $S(\tau')$ may be made the starting point for the construction of $S(\tau)$ for infinitely many values of $\tau$, greater than and close to $\tau'$, with the aid of II. Thus the assumption that $\tau'$ be less than 1 and at the same time the greatest value in every neighborhood of which there are smaller values of $\tau$ admitting a surface $S(\tau)$, has led to a contradiction. Hence $\tau'=1$ and $S(\tau')=S(1)$ exists.


* E. Hopf, Über den funktionalen, insbesondere den analytischen Charakter der Lösungen elliptischer Differentialgleichungen zweiter Ordnung, Mathematische Zeitschrift, vol. 34 (1931), pp. 194-233.

† Hans Lewy, A priori limitations for solutions of elliptic Monge-Ampère equations, II, these Transactions, vol. 41 (1937), pp. 365-374, especially Theorem 2' on p. 374.

‡ Cf. Blaschke, Differentialgeometrie, vol. 1, Berlin, 1924, p. 150.


UNIVERSITY OF CALIFORNIA,
BERKELEY, CALIF.


ON THE SERIES FOR THE PARTITION FUNCTION* 


BY 
D. H. LEHMER 


1. Introduction. In 1917 Hardy and Ramanujan† gave the following asymptotic formula for the number $p(n)$ of partitions of $n$,

(1.1) $p(n) = \frac{1}{2\pi\sqrt{2}} \sum_{k=1}^{[\alpha n^{1/2}]} k^{1/2} A_k(n)\,\frac{d}{dn}\left(\frac{e^{C\lambda_n/k}}{\lambda_n}\right) + O(n^{-1/4}),$

where

$C = \pi(2/3)^{1/2}, \qquad \lambda_n = (n - 1/24)^{1/2},$

and the coefficients $A_k(n)$ are defined by

$A_1(n) = 1, \qquad A_2(n) = (-1)^n, \qquad A_3(n) = 2\cos[(12n - 1)\pi/18]$

and in general

(1.2) $A_k(n) = \sum_{(\rho)} \omega_{\rho,k}\, e^{-2n\rho\pi i/k},$

where $\rho$ ranges over those numbers which are less than $k$ and prime to $k$. Here $\omega_{\rho,k}$ are certain $24k$th roots of unity which arise in the theory of modular functions and are defined by (1.4) and (1.5).
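For readers who wish to experiment with (1.2), the following is a minimal numerical sketch. It does not use (1.4) and (1.5); it assumes the equivalent and now standard description $\omega_{\rho,k} = e^{\pi i s(\rho,k)}$, where $s(\rho,k)$ is the Dedekind sum. That description, and the function names `dedekind_sum` and `A`, are assumptions of this sketch and not part of the paper.

```python
from fractions import Fraction
from math import cos, gcd, pi
import cmath


def dedekind_sum(h, k):
    """Exact Dedekind sum s(h, k) = sum_{r=1}^{k-1} ((r/k)) ((h*r/k))."""
    total = Fraction(0)
    for r in range(1, k):
        x = Fraction((h * r) % k, k)  # fractional part of h*r/k
        if x != 0:  # the sawtooth function vanishes at integers
            total += (Fraction(r, k) - Fraction(1, 2)) * (x - Fraction(1, 2))
    return total


def A(k, n):
    """A_k(n) as in (1.2), with omega taken as exp(pi*i*s(rho, k)) (assumption)."""
    total = 0j
    for rho in range(k):
        if gcd(rho, k) == 1:
            # exponent divided by pi*i: s(rho, k) - 2*n*rho/k, reduced mod 2
            phase = dedekind_sum(rho, k) - Fraction(2 * ((n * rho) % k), k)
            total += cmath.exp(1j * pi * float(phase))
    return total.real  # the sum is real; the imaginary part is rounding noise


if __name__ == "__main__":
    for n in range(6):
        closed_form = 2 * cos((12 * n - 1) * pi / 18)
        print(n, round(A(2, n), 9), round(A(3, n), 9), round(closed_form, 9))
```

The printed columns compare the computed $A_2(n)$ and $A_3(n)$ with the closed forms $(-1)^n$ and $2\cos[(12n-1)\pi/18]$ quoted above.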
Without knowledge of the behavior of $A_k(n)$ for large values of $k$ other than the obvious fact that

(1.3) $A_k(n) = O(k)$

Hardy and Ramanujan were unable to decide several questions about (1.1). For instance, if $\alpha$ is given, (1.1) gives $p(n)$ to within half a unit for all sufficiently large $n$. Just how large $n$ must be was not discovered. Whether (1.1) would converge if extended to infinity and what is the least number of terms that need be taken were other questions depending on the magnitude of $A_k(n)$.

Quite recently Rademacher‡ has shown that if in (1.1) we replace $e^x$ by $2\sinh x$ we obtain an infinite series for $p(n)$ (with $\alpha=\infty$) whose convergence follows easily from (1.3). This striking result enables one to estimate the


* Presented to the Society, March 27, 1937; received by the editors March 11, 1937.
† Proceedings of the London Mathematical Society, (2), vol. 17, pp. 75-115.
‡ Proceedings of the London Mathematical Society, (2), vol. 43, pp. 241-254.




difference between $p(n)$ and the first $N$ terms of the series of Hardy and Ramanujan. This estimate, of course, depends on $A_k(n)$ so that information about the general behavior of $A_k(n)$ for large as well as small values of $k$ is important in this connection.
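As an illustration of the Rademacher series just described, the sketch below sums its first terms numerically. It assumes the usual statement $p(n) = \frac{1}{\pi\sqrt{2}}\sum_{k\ge 1} k^{1/2}A_k(n)\,\frac{d}{dn}\bigl(\sinh(C\lambda_n/k)/\lambda_n\bigr)$ with $C=\pi(2/3)^{1/2}$, and it computes $A_k(n)$ through Dedekind sums rather than through (1.4) and (1.5). Both choices, and all function names, are assumptions of the sketch.

```python
from fractions import Fraction
from math import cosh, gcd, pi, sinh, sqrt
import cmath


def dedekind_sum(h, k):
    """Exact Dedekind sum s(h, k)."""
    total = Fraction(0)
    for r in range(1, k):
        x = Fraction((h * r) % k, k)
        if x != 0:
            total += (Fraction(r, k) - Fraction(1, 2)) * (x - Fraction(1, 2))
    return total


def A(k, n):
    """A_k(n), with omega taken as exp(pi*i*s(rho, k)) (assumption)."""
    total = 0j
    for rho in range(k):
        if gcd(rho, k) == 1:
            phase = dedekind_sum(rho, k) - Fraction(2 * ((n * rho) % k), k)
            total += cmath.exp(1j * pi * float(phase))
    return total.real


def rademacher_partial_sum(n, terms):
    """Sum of the first `terms` terms of the Rademacher series for p(n), n >= 1."""
    C = pi * sqrt(2.0 / 3.0)
    lam = sqrt(n - 1.0 / 24.0)
    total = 0.0
    for k in range(1, terms + 1):
        x = C * lam / k
        # d/dn [sinh(C*lam/k)/lam], using d(lam)/dn = 1/(2*lam)
        deriv = ((C / k) * cosh(x) / lam - sinh(x) / lam ** 2) / (2.0 * lam)
        total += sqrt(k) * A(k, n) * deriv
    return total / (pi * sqrt(2.0))


if __name__ == "__main__":
    for n in (10, 50, 100):
        print(n, rademacher_partial_sum(n, 30))
```

In this sketch thirty terms are far more than is needed for these small values of $n$; the count is chosen only to keep the truncation error comfortably below one half.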

Apart from these questions there is the problem of actually using (1.1) to determine isolated values of $p(n)$ for $n$ large. The task of evaluating $A_k(n)$ from its definition is quite formidable when $k$ is large. Hardy and Ramanujan gave $A_k(n)$ for $k \le 18$ as sums of cosines, while the actual values of $A_k(n)$ for $k \le 20$ and all $n$ have been tabulated recently.* The apparent intricacy of $A_k(n)$ would seem definitely to restrict the usefulness of (1.1).

It would therefore seem desirable to make an intensive study of $A_k(n)$. In a recent paper† we have proved that the series (1.1) would diverge if extended to infinity. This result was obtained from a simple estimate of $A_k(n)$, where $k$ is a square of a prime. In this paper we give formulas of $A_k(n)$ as a single term thus eliminating the necessity of any sort of tables of $A_k(n)$. This result enables us to give close estimates for $A_k(n)$, and to answer the questions mentioned above, and makes feasible the application of (1.1) to any number of terms.

The method employed in the present paper depends in part on showing that $A_k(n)$ may be transformed into “generalized Kloosterman sums.” These in turn may be evaluated by a slight extension of the results of Salié.‡ The results of this paper were first obtained independently of Kloosterman sums. Considerable space is saved, however, by referring to results already published. Section 2 is devoted to multiplication theorems for $A_k(n)$ which reduce the evaluation of $A_k(n)$ to the case in which $k$ is a prime or a power of a prime.§ In §3 these evaluations are carried out. The final section applies the results of the preceding sections to the Hardy-Ramanujan and Rademacher series.

The quantities $\omega_{\rho,k}$ appearing in the definition (1.2) of $A_k(n)$ are given by


(1.4) (=*) exp [- («-—) xi] 


if k is odd, and by 


* Journal of the London Mathematical Society, vol. 11 (1936), pp. 117-118. Erratum: for $A_{20}(n)$ read $A_{20}(n+5)$.

† Journal of the London Mathematical Society, vol. 12, pp. 171-176.

‡ Mathematische Zeitschrift, vol. 34 (1931), pp. 91-109.

§ This multiplicative property, discovered empirically, could have been anticipated eight years ago from a result of Estermann: Hamburger Abhandlungen, vol. 7 (1929), p. 91.


(1.5) (—) exp | - («-—) ni | 


when $k$ is even. Here $(a/b)$ is the symbol of Jacobi and $\rho\bar\rho \equiv 1 \pmod{k}$.

If we substitute $\exp[\{1-(a/b)\}(\pi i/2)]$ for the symbol $(a/b)$ in (1.4) and (1.5), we obtain after a few simple reductions the following expressions for $A_k(n)$:


(1.6) $A_k(n) = \sum_{(\rho)} \exp[f_n(\rho)\pi i/12k]$ ($k$ odd),


(1.7) $A_k(n) = \sum_{(\rho)} \exp[g_n(\rho)\pi i/12k]$ ($k$ even),


where $\rho$ ranges over a complete system of residues prime to $k$ and where the functions $f$ and $g$ are given by


(1.8) fale) =falo, = — 


—k 
The number $1-24n$ plays the dominant role in what follows and is abbreviated by writing

(1.10) $\nu = 1 - 24n.$

From (1.6) and (1.7) it is seen that $f(\rho)$ and $g(\rho)$ need only be determined modulo $24k$. The following fundamental congruences are used many times in the sequel and are set forth here for reference. If $k$ is odd

(1.11) $f_n(\rho) \equiv \nu\rho + \bar\rho \pmod{k \text{ or } 3k}$*

according as 3 is prime to $k$ or not.

(1.12) $f_n(\rho) \equiv 0 \pmod 3$

if $k$ is prime to 3.

(1.13) $f_n(\rho) \equiv 2k\left(\frac{-\rho}{k}\right) + k - 3 \pmod 8.$

If $k = 2^\lambda k_1$, where $k_1$ is odd,

(1.14) $g_n(\rho) \equiv \nu\rho + \bar\rho \pmod{k_1 \text{ or } 3k_1}$

according as $k_1$ is prime to 3 or not. If $k_1$ is prime to 3

(1.15) $g_n(\rho) \equiv 0 \pmod 3.$

* In these congruences $\bar\rho$ stands for $1/\rho \pmod M$ where $M$ is the modulus of the congruence.

For every 

(1.16) col) = +3 + 2% (— “) + + 3)p (mod 
p 


2. Multiplication theorems. We shall derive three theorems for expressing $A_k(n)$ as a product of two $A$'s whose subscripts are coprime integers whose product is $k$. This enables us to evaluate $A_k(n)$ for all $k$ when the values of $A_q$ are known, where $q$ runs over all powers of primes.


THEOREM 1. If $k_1$ and $k_2$ are odd coprime integers, then

(2.1) $A_{k_1}(n_1)A_{k_2}(n_2) = A_{k_1k_2}(n_3),$

where $n_3 \equiv k_2^2 n_1 + k_1^2 n_2 - (k_1^2 + k_2^2 - 1)/24 \pmod{k_1k_2}$.

Remark. In case $k_1$ or $k_2$ is a multiple of 3, the numerator of the fraction $(k_1^2 + k_2^2 - 1)/24$ is also a multiple of 3 and the fraction becomes of the form $M/8$. In any case, then, the quantity $n_3$ may be replaced by an integer modulo $k_1k_2$.
Proof. We consider first the product

(2.2) $A_{k_1}(n_1)A_{k_2}(n_2) = \sum_{(\rho_1)}\sum_{(\rho_2)} \exp\left[\{k_2 f_{n_1}(\rho_1, k_1) + k_1 f_{n_2}(\rho_2, k_2)\}\frac{\pi i}{12k_1k_2}\right],$

where $\rho_1$ and $\rho_2$ range over the numbers less than and prime to $k_1$ and $k_2$ respectively. For each value of these summation indices we define $\rho_3$ by the system of congruences

(2.3) $\rho_3 \equiv \rho_1/k_2 \pmod{k_1},$
(2.4) $\rho_3 \equiv \rho_2/k_1 \pmod{k_2}.$

It is clear that as $\rho_1$ and $\rho_2$ range over their respective values the numbers $\rho_3$ modulo $k_1k_2$ run over the numbers $<k_1k_2$ and prime to $k_1k_2$ so that

(2.5) $A_{k_1k_2}(n_3) = \sum_{(\rho_3)} \exp[f_{n_3}(\rho_3, k_1k_2)\pi i/12k_1k_2].$

We show that every term of (2.2) is equal to the corresponding term of (2.5), where the correspondence is determined by (2.3) and (2.4) and $n_3$ is defined by (2.1). This amounts to showing that

(2.6) $D_1 = f_{n_3}(\rho_3, k_1k_2) - \{k_2 f_{n_1}(\rho_1, k_1) + k_1 f_{n_2}(\rho_2, k_2)\} \equiv 0 \pmod{24k_1k_2}.$

* See footnote on p. 273.


In the first place if neither k; nor kz is divisible by 3, then it follows from 
(1.12) that 


D,; = 0 (mod 3). 


On the other hand we may suppose from symmetry that f, is divisible by 3 
if not both &; and & are prime to 3. Therefore let k = 3k: or ke according as 3 
does or does not divide k,k2. Our task is then to show that 


D, = 0 (mod 8k). 
We consider first the modulus A. By (1.11) we have 
D, = vps + ps — ke(vipi + fi) (mod 
where 
(2.7) vg = 1 — = + Reve. 
Hence in view of (2.3) we have 
+ ps = kevip: + kop: (mod ki), 
so that 
= 0 (mod &;). 

If k=ke, the same argument shows that D,=0 (mod &). In case k=3k, 
we note that f,,(p1, #1) =0 (mod 3) so that 

Dy = fn,(es, Rike) — kifn,(p2, Re) = vss + Bs — ki(vep2 + (mod 


but from (2.7) vs=k? v2. (mod k), so that 


= ki(Rips po) ) (mod k). 


1P2P3 


By (2.4) the second factor is a multiple of k, while the third factor is a multi- 
ple of 3 since (mod 3), and =p? (mod ke) so that kip293;=1 (mod 3). 


Hence 
= 0 (mod 


There remains to show that D,=0 (mod 8). By (1.13) we have 


— 


dD, — ( 


1*%2 


| + ¢ | + sta} (mod 8), 
ky ko 


| 
(2.8) 




while, by (2.3) and (2.4), 


We now separate two cases according as k; and are both of the form 4x—1 
or not. In the affirmative case we have 


— Ps Pl p2 
= —(—)(—], and =1 4), 
(=) 


so that (2.8) becomes 


In case not both &; and are of the form 4%—1 we have from (2.9) 


G2) Ge) 


— 4hike + (ki + 3)(ke + 3) + 4 


— A(kike — 1) + (ki + 3)(ke + 3) = (hi + 3) (he + 3) 
= 0 (mod 8), 

since at least one factor is a multiple of 4. 

This completes the proof of Theorem 1. 

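THEOREM 2. Let $k$ be odd and $\lambda$ be an integer $>1$, then

$A_{2^\lambda}(n_1)A_k(n_2) = (-1)^{2^{\lambda-2}} A_{2^\lambda k}(n_3),$

where
(2.10) $n_3 \equiv k^2 n_1 + 2^{2\lambda} n_2 - (k^2 - 1 + 2^{2\lambda})/24 \pmod{2^\lambda k}.$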



and 

— 

2} 

| 




Proof. Since 
(— 1)” * = exp (3-2*kwi/12-2k), 
we have to show that the product 
(2.11) = exp [{2*fn,(01, 2) + 2%) }wi/12- 2k] 


(1) (P2) 
is equal to 


(2.12) (— = exp [{gn,(0s, 2%%) + ri/12-2k]. 


In fact we show that, provided p; is related to p; and pz by means of the system 
(2.13) ps = pi/2* (mod &), 
(2.14) ps = p2/k (mod 2°), 


then the corresponding terms of (2.11) and (2.12) are equal. To this effect 
we consider the difference 


Dz = gn,(ps, 2k) + 3-2%% — [2*fn,(01, &) + (02, 


and prove that it is divisible by 24-2*%. 

Consider first the modulus 3 if & is not divisible by 3. Then by (1.12) and 
(1.15) each term of D, is a multiple of 3. Next consider D, modulo k or 3k ac- 
cording as 3 is prime to & or not. Referring to (1.11) and (1.14) we find that 


D2 = + p3 — 2(vipi + (mod k or 3k). 
But v3=1—24n;=2” — 24-27, =2*», (mod & or 3k), so that 


) (mod & or 3). 


1 
2*pips 


The second factor is a multiple of k by (2.13) and, in case 3 divides k, the 
third factor is also a multiple of 3. Hence 


Dz = 0 (mod 32). 
We must now show that D.=0 (mod 2+). Using (1.13) and (1.16) we 
have 


— Dk 
Dz = v3p3 + 


) + 2%p3 + 3-2kps + ps + 3-2%k 


=") ~PRe+3-2 


+ 2 ( 
p2 


) k + 2”kp2 + + (mod 2)+*). 




But since v3;=k?v,+2” (mod 2'+*) we obtain on collecting terms 


1 
Dz = (kp; p2) + — —) + 2D3k(p2 ps) + 3-2°k 
P2ps 


Consider for the moment the Jacobi symbols. We will show that the quantity 


We consider separately the cases \ even and J odd. 
If \ is even, we have by (2.14), p2=p3k (mod 4), and 


By (2.13) 


(2.15) 


()- GAG) - OG) 


— (— 1) (mod 4). 


If \ is odd, we have A\>2, and p2=p;k (mod 8) so that 


Gra 


Hence, in this case, 


)+ - OG) 


— (— (mod 4). 


|| 
k k k 
Hence 
= 
and 
C98) | 
k 
= 


With this result we return to (2.15). We note first that on account of (2.14) we 
have 

(2.16) = 2h, 

so that the factor 


1 
(a + 2k — 
P2ps. 


of (2.15) may be considered modulo 8 only. Since ve=1 (mod 8) we find that 
by (2.16) the first term of (2.15) is congruent to 


p3h*- 2? (mod 2)+%), 
Hence the first four terms of (2.15) become congruent to 
2[p3(h? + 1) + + 1)] — 3-2kp3(k — 1) 

= — 3-2kp;(k — 1) = 3-2%3(k — 1) (mod 
since the quantity in square brackets is even and 2A+12A+3. Hence we 
have 

Dz = 2[3pa(k — 1) — (k — 3) + 20] 
= 2[3psk — — k +3 — 2k-(— (mod 24), 
But the quantity in brackets is divisible by 8. In fact if both & and p; are of 
the form 4% —1 it becomes 
(3p3 + 1)(k — 1) + 4 = 0 (mod 8). 
In the opposite case we have 
3(k — 1)(p3 — 1) = 0 (mod 8). 


Hence in both cases Dz is divisible by 2*+*. This completes the proof of the 
theorem. 


THEOREM 3. Let $k$ be an odd integer, then
(2.17) $A_k(n) = A_{2k}(4n + (k^2 - 1)/8).$
Proof. Since 


(2.18) Ax(m) = exp Ee 2k) 
4k 


while 


(2.19) A;,(n) = exp | k) | 


| 
| | 




where for brevity we take m, for 4n+(k?—1)/8, it suffices to show that a cor- 
respondence may be set up between p; and pz so that corresponding terms of 
(2.18) and (2.19) are equal. We show that the correspondence is simply 


(2.20) pi = p2/2 (mod k). 
To this effect we define D; by 
(2.21) Ds = 2fn(p2, &) — 2k) 


and show that D;=0 (mod 482). 

We first show that D; is divisible by 3k. In case 3 is prime to k, the fact 
that D;=0 (mod &) is an immediate consequence of (1.12) and (1.15). In case 
3 divides k we use (1.11) and (1.14). In either case 


D3; = 2vpe + 2pe2 — V1~pi1 — pi (mod k or 3k). 


But 
(2.22) v, = 1— 24m, = 1 — 96n — 3(k® — 1) = 4(1 — 24m) = 4v (mod 32). 


So that 


D; = (p2 — 291) (2 ~ (mod or 3k). 
By (2.20) the first factor is divisible by k while in case 3|k, the second factor 
contains 3 by (2.20), since y=1 (mod 3). Hence D;=0 (mod 32). 
We must now show that D; is divisible by 16. Using (1.13) and (1.16) in 
(2.21) we find 


P2 


k 


) + 2k 6 — 


Pi 


In view of (2.20) and (2.22) we may substitute for p2, and »; and obtain 


k 


—1 
+ + (mod 16). 


Considering the quantity inside the braces modulo 4 and replacing all odd 
factors of even numbers by unity we obtain 


2p1 2k 2 
D3; = ) ( )+ (- 1) 1— (— + 
k Pi 




) ( =) (— 
k P1 


+ (= 


This proves Theorem 3. 
Another form of Theorem 3, which exhibits it as a multiplication theorem, is

THEOREM 4. If $k$ is odd, then
(2.23) $A_2(n_1)A_k(n_2) = A_{2k}(n_3),$
where

$n_3 \equiv 4n_2 + kn_1 + (k^2 - 1)/8 \pmod{2k}.$

This follows easily from the following

LEMMA 1. If $k$ and $n$ are arbitrary integers

$A_{2k}(n) = -A_{2k}(n + k).$

Proof. This is an immediate consequence of applying the identity

$e^{-2n\rho\pi i/2k} = -e^{-2(n+k)\rho\pi i/2k},$

where $\rho$ is odd, to the definition (1.2) of $A_k(n)$.

To prove Theorem 4 we need only to observe that if we apply Lemma 1 $n_1$ times with $n = 4n_2 + (k^2-1)/8$, we obtain from Theorem 3

$A_{2k}\left(4n_2 + kn_1 + \frac{k^2 - 1}{8}\right) = (-1)^{n_1}A_k(n_2) = A_2(n_1)A_k(n_2).$
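Lemma 1 and Theorems 3 and 4 lend themselves to a quick numerical spot check. The sketch below is only that: it evaluates $A_k(n)$ through Dedekind sums, taking $\omega_{\rho,k} = e^{\pi i s(\rho,k)}$, which is an assumption of the sketch and not the route through (1.4) and (1.5) taken in this paper. The helper name `defects` is likewise hypothetical.

```python
from fractions import Fraction
from math import gcd, pi
import cmath


def dedekind_sum(h, k):
    """Exact Dedekind sum s(h, k)."""
    total = Fraction(0)
    for r in range(1, k):
        x = Fraction((h * r) % k, k)
        if x != 0:
            total += (Fraction(r, k) - Fraction(1, 2)) * (x - Fraction(1, 2))
    return total


def A(k, n):
    """A_k(n), with omega taken as exp(pi*i*s(rho, k)) (assumption)."""
    total = 0j
    for rho in range(k):
        if gcd(rho, k) == 1:
            phase = dedekind_sum(rho, k) - Fraction(2 * ((n * rho) % k), k)
            total += cmath.exp(1j * pi * float(phase))
    return total.real


def defects(k, n1, n2):
    """Return the residuals of Lemma 1, Theorem 3 and Theorem 4 for odd k."""
    shift = (k * k - 1) // 8
    lemma1 = A(2 * k, n2) + A(2 * k, n2 + k)
    theorem3 = A(k, n2) - A(2 * k, 4 * n2 + shift)
    theorem4 = A(2, n1) * A(k, n2) - A(2 * k, 4 * n2 + k * n1 + shift)
    return lemma1, theorem3, theorem4


if __name__ == "__main__":
    for k in (1, 3, 5, 7, 9):
        print(k, [round(d, 12) for d in defects(k, 3, 2)])
```

All three residuals should vanish up to rounding error for every odd $k$.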


Another form of Theorem 3 states that if k is odd, 


This follows readily from Theorem 4. 

Theorems 1, 2, and 4 may be used to express $A_k(n)$ in terms of $A$'s whose subscripts are powers of primes dividing $k$. This is illustrated in the following examples.


=4 
| )- Gah 


Example I. Express $A_{35}(23)$ in terms of $A_5$ and $A_7$. By (2.1), $n_3 = 23 \equiv 25n_2 + 49n_1 - 73/24 \pmod{35}$. This gives two congruences

$49n_1 \equiv 0 \pmod 5$

and

$25n_2 \equiv 3 \pmod 7.$

Hence $n_1 \equiv 0 \pmod 5$ and $n_2 \equiv 6 \pmod 7$. Therefore

$A_{35}(23) = A_5(0)A_7(6).$
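The reduction carried out in Example I can be mechanized. The sketch below solves the congruence of Theorem 1 for $n_1$ and $n_2$ when $n_3$, $k_1$, $k_2$ are given; the function name `split_theorem1` is hypothetical. Following the Remark to Theorem 1, the fraction $(k_1^2+k_2^2-1)/24$ is first reduced by its common factor with 24 (which is 1 or 3) so that the remaining denominator is invertible modulo $k_1k_2$.

```python
from math import gcd


def split_theorem1(n3, k1, k2):
    """Solve n3 = k2^2*n1 + k1^2*n2 - (k1^2 + k2^2 - 1)/24 (mod k1*k2).

    k1 and k2 must be odd and coprime. Returns (n1 mod k1, n2 mod k2).
    Requires Python 3.8+ for pow(x, -1, m).
    """
    K = k1 * k2
    numerator = k1 * k1 + k2 * k2 - 1
    g = gcd(numerator, 24)  # 1 or 3, since the numerator is odd
    # the fraction as a residue modulo k1*k2
    frac = (numerator // g) * pow(24 // g, -1, K) % K
    rhs = n3 + frac
    n1 = rhs * pow(k2 * k2, -1, k1) % k1  # the k1^2*n2 term vanishes mod k1
    n2 = rhs * pow(k1 * k1, -1, k2) % k2  # the k2^2*n1 term vanishes mod k2
    return n1, n2


if __name__ == "__main__":
    print(split_theorem1(23, 5, 7))  # Example I
    print(split_theorem1(1, 3, 5))   # the step used for A_15(1) in Example II
```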
Example II. Express $A_{30}(17)$ in terms of $A_2$, $A_3$, and $A_5$. By Theorem 4 with $k=15$ we have

$n_3 = 17 \equiv 4n_2 + 15n_1 + 28 \pmod{30},$

whence

$n_2 \equiv 1 \pmod{15},$
$n_1 \equiv 1 \pmod 2,$

so that

$A_{30}(17) = A_2(1)A_{15}(1).$

Applying Theorem 1 to $A_{15}(1)$ as in Example I, we find

$A_{15}(1) = A_3(2)A_5(2),$

whence

$A_{30}(17) = A_2(1)A_3(2)A_5(2).$
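The factorizations obtained in the examples of this section can be confirmed numerically. The sketch below computes each $A_k(n)$ directly from its defining sum, with $\omega_{\rho,k}$ taken as $e^{\pi i s(\rho,k)}$ (Dedekind sum); that identification is an assumption of the sketch and is not used anywhere in the paper.

```python
from fractions import Fraction
from math import gcd, pi
import cmath


def dedekind_sum(h, k):
    """Exact Dedekind sum s(h, k)."""
    total = Fraction(0)
    for r in range(1, k):
        x = Fraction((h * r) % k, k)
        if x != 0:
            total += (Fraction(r, k) - Fraction(1, 2)) * (x - Fraction(1, 2))
    return total


def A(k, n):
    """A_k(n), with omega taken as exp(pi*i*s(rho, k)) (assumption)."""
    total = 0j
    for rho in range(k):
        if gcd(rho, k) == 1:
            phase = dedekind_sum(rho, k) - Fraction(2 * ((n * rho) % k), k)
            total += cmath.exp(1j * pi * float(phase))
    return total.real


if __name__ == "__main__":
    # Example I: A_35(23) = A_5(0) A_7(6)
    print(A(35, 23), A(5, 0) * A(7, 6))
    # Example II: A_30(17) = A_2(1) A_3(2) A_5(2)
    print(A(30, 17), A(2, 1) * A(3, 2) * A(5, 2))
    # Example III: A_36(13) = -A_4(1) A_9(5)
    print(A(36, 13), -A(4, 1) * A(9, 5))
```

Each printed pair should agree to within rounding error.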
Example III. Express $A_{36}(13)$ in terms of $A_4$ and $A_9$. By Theorem 2, $A_{36}(13) = -A_4(n_1)A_9(n_2)$, where $13 \equiv 81n_1 + 16n_2 - 4 \pmod{36}$, so that

$n_2 \equiv 5 \pmod 9,$
$n_1 \equiv 1 \pmod 4$

and we have

$A_{36}(13) = -A_4(1)A_9(5).$


3. The evaluation of $A_k(n)$. From the results of the preceding section it is clear that questions concerning the actual value of $A_k(n)$ or merely the order of magnitude of $A_k(n)$ may be reduced to the corresponding questions about $A_q(n)$, where $q$ is a power of a prime. Three cases present themselves quite naturally, namely those in which (I) $q=p^\alpha$, where $p$ is a prime $>3$ and $\alpha \ge 1$, (II) $q$ is a power of 3, (III) $q$ is a power of 2. In all cases the number $A_q(n)$ may be expressed in terms of generalized Kloosterman sums of the type




(s) 
where $\chi$ is a quadratic character, and $s\bar{s} \equiv 1 \pmod q$.

The problem of evaluating these sums has been solved in case $q$ is a prime by Salié. In case $q$ is a power of an odd prime the sums may be easily evaluated if use is made of Salié's discussion (§§2, 3) of the original Kloosterman sum. In fact the introduction of the character has no influence on the argument until the very last stage where it results in changing some of the cosines to sines or vice versa. We give therefore without further comment the following lemmas.


LEMMA* 2. Let $q=p^\alpha$, where $p$ is an odd prime. Then if $s$ runs over the numbers less than and prime to $q$, the sum


(0 if ais a non-residue of q prime to q, 


410 
(=) cos —— if a = 6 (mod q), a prime to q, 
> =< q 


\Q 


0 if a is divisible by p anda > 1, 
is divisible by p and a = 1. 


Lemma 3. If g=3", a>1 and if a=1 (mod 3), then 


3e-1 3 q 


(s) 
where (mod q). 


To apply these lemmas to the evaluation of A,(m) where q is odd we sepa- 
rate two cases. 

Case I. $q=p^\alpha$, where $p$ is a prime $>3$. Returning to the definition (1.8) of $f_n(\rho)$, we have from congruences (1.12) and (1.13)

$f_n(\rho) \equiv 0 \pmod 3$

and

$f_n(\rho) \equiv q - 3 + 2q\left(\frac{-\rho}{q}\right) \pmod 8.$

If $q \equiv 1 \pmod 4$, $q-3 \equiv -2q(2/q) \pmod 8$. Hence $f_n(\rho)$ is an even or odd multiple of 12 according as $(2\rho/q) = +1$ or $-1$. If $(2\rho/q) = +1$ we may write

$f_n(\rho) \equiv 24t(\rho) \pmod{24q},$

while, if $(2\rho/q) = -1$,

$f_n(\rho) \equiv 24t(\rho) + 12q \pmod{24q},$

where, in both cases,

(3.1) $t = t(\rho) \equiv f_n(\rho)/24 \equiv (\nu\rho + \bar\rho)/24 \pmod q$

by (1.11). Hence for a given $\rho$ we get a term of (1.6) of the form $e^{2\pi i t/q}$ or $e^{2\pi i t/q}e^{\pi i}$, so that if $q \equiv 1 \pmod 4$,

$A_q(n) = \sum_{(\rho)} \left(\frac{2\rho}{q}\right) e^{2\pi i t/q},$

where $t$ is defined by (3.1).

* Cf. Salié, loc. cit., equations (54) and (57), p. 102, in case $q$ is a prime. For $q$ a power of a prime compare (32) and (33), p. 97.
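If $q \equiv -1 \pmod 4$, $f_n(\rho) \equiv 0 \pmod 3$ and

$f_n(\rho) \equiv -6q\left(\frac{2\rho}{q}\right) \pmod 8.$

Hence, in this case,

$f_n(\rho) \equiv 24t - 6q\left(\frac{2\rho}{q}\right) \pmod{24q},$

where $t$ is given by (3.1) and the typical term of (1.6) is, in this case,

$e^{2\pi i t/q}\exp\left[-\left(\frac{2\rho}{q}\right)\frac{\pi i}{2}\right] = -i\left(\frac{2\rho}{q}\right)e^{2\pi i t/q}.$

So that whether $q \equiv +1$ or $-1 \pmod 4$, we have

(3.2) $A_q(n) = (-i)^{((q-1)/2)^2}\sum_{(\rho)}\left(\frac{2\rho}{q}\right)e^{2\pi i t/q},$ where $t \equiv (\nu\rho + \bar\rho)/24 \pmod q$.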


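In order to apply Lemma 2 to (3.2) we set $24\rho \equiv \bar{s} \pmod q$, so that

$t \equiv s + \frac{\nu}{24^2}\,\bar{s} \pmod q.$

With this change in notation (3.2) becomes

$A_q(n) = (-i)^{((q-1)/2)^2}\left(\frac{3}{q}\right)\sum_{(s)}\left(\frac{s}{q}\right)\exp\left[\frac{2\pi i(s + a\bar{s})}{q}\right],$ where $a \equiv \nu/24^2 \pmod q$.

Applying Lemma 2, we obtain


THEOREM 5. If $q=p^\alpha$, $p$ a prime $>3$, $\alpha \ge 1$ and $\nu=1-24n$, then $A_q(n)$ equals

$0$ if $\nu$ is a non-residue* of $q$, prime to $q$,

$2\left(\frac{3}{q}\right)q^{1/2}\cos\frac{4\pi m}{q}$ if $\nu \equiv (24m)^2 \pmod q$, $\nu$ prime to $q$,

$0$ if $\nu \equiv 0 \pmod p$ and $\alpha > 1$,

$\left(\frac{3}{q}\right)q^{1/2}$ if $\nu \equiv 0 \pmod p$ and $\alpha = 1$.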
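Theorem 5 reduces $A_q(n)$ to a single term, and this is easy to compare with the defining sum. In the sketch below the closed form follows the four cases of the theorem, with the square root of $\nu$ found by brute force, while the direct value uses Dedekind sums ($\omega_{\rho,k}=e^{\pi i s(\rho,k)}$). The Dedekind-sum route and all function names are assumptions of the sketch.

```python
from fractions import Fraction
from math import cos, gcd, pi, sqrt
import cmath


def dedekind_sum(h, k):
    """Exact Dedekind sum s(h, k)."""
    total = Fraction(0)
    for r in range(1, k):
        x = Fraction((h * r) % k, k)
        if x != 0:
            total += (Fraction(r, k) - Fraction(1, 2)) * (x - Fraction(1, 2))
    return total


def A(k, n):
    """A_k(n) from the defining sum, omega = exp(pi*i*s(rho, k)) (assumption)."""
    total = 0j
    for rho in range(k):
        if gcd(rho, k) == 1:
            phase = dedekind_sum(rho, k) - Fraction(2 * ((n * rho) % k), k)
            total += cmath.exp(1j * pi * float(phase))
    return total.real


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, by Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def A_theorem5(p, alpha, n):
    """Closed form of Theorem 5 for q = p**alpha, p a prime > 3."""
    q = p ** alpha
    nu = 1 - 24 * n
    jacobi_3_q = legendre(3, p) ** alpha  # Jacobi symbol (3/q)
    if nu % p == 0:
        return jacobi_3_q * sqrt(q) if alpha == 1 else 0.0
    for m in range(q):  # brute-force search for (24m)^2 = nu (mod q)
        if (24 * m) ** 2 % q == nu % q:
            return 2 * jacobi_3_q * sqrt(q) * cos(4 * pi * m / q)
    return 0.0  # nu is a non-residue of q


if __name__ == "__main__":
    for p, alpha in ((5, 1), (7, 1), (5, 2)):
        for n in range(5):
            print(p ** alpha, n, round(A(p ** alpha, n), 9), round(A_theorem5(p, alpha, n), 9))
```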


Case II. g=3, B21. First let 8 be even. Then from (1.11) and (1.13) we 
have 


= 84(p) (mod 
(3.3) t = = (vp + (mod 


so that exp ] 
If 8 is odd, then 


2 
= 8t — 6g (*) (mod 8 - 36+), 
q 
Hence in this case 
2p 
exp [fn(o)4i/12q] = — 
q 
where ¢ is given again by (3.3). Hence in both cases we have 


(3.4) A,(n) = (- 
\Q 


where p runs over the numbers prime to 3 and less than3 *. Since é(9 +34) =#(p) 
(mod 3+), if p is replaced by p—3*p? we may let p run up to 3+! in (3.4) 
and obtain 3A,(m). At the same time we replace p by s/8 (mod 38+), so that 


t(p) = s + § (mod 
and (2/9) =(s/g). Hence (3.4) becomes 


A,(n) = i) (=) | 
\Q 


* This condition should not be confused with $(\nu/q) = -1$. We mean that no solution exists of the congruence $x^2 \equiv \nu \pmod q$.


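where $a \equiv \nu/8^2 \pmod{3^{\beta+1}}$. Since $\nu \equiv 1 \pmod 3$, we may apply Lemma 3 with $\alpha = \beta+1$ and obtain the following:


THEOREM 6. Let $\nu=1-24n$, then

$A_{3^\beta}(n) = (-1)^{\beta+1}\left(\frac{m}{3}\right)\frac{2}{3^{1/2}}\,3^{\beta/2}\sin\frac{4\pi m}{3^{\beta+1}},$

where $(8m)^2 \equiv \nu \pmod{3^{\beta+1}}$.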


Case III. $q=2^\lambda$. In this case we shall evaluate $A_q(n)$ directly without passing to a generalized Kloosterman sum, since the introduction of the appropriate quadratic character into Salié's discussion of the corresponding Kloosterman sum cannot be accomplished without considerable reconstruction. The method of proof is similar to that used by Salié and Estermann.


THEOREM 7. If $\lambda \ge 0$,

$A_{2^\lambda}(n) = (-1)^\lambda\left(\frac{-1}{m}\right)2^{\lambda/2}\sin\frac{4\pi m}{2^{\lambda+3}},$

where $m$ is an integer $\equiv \nu^{1/2}/3 \pmod{2^{\lambda+3}}$.
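The closed forms of Theorems 6 and 7 can be compared with the defining sum in the same way. The sketch below finds $m$ by brute force from $(8m)^2 \equiv \nu \pmod{3^{\beta+1}}$ and $(3m)^2 \equiv \nu \pmod{2^{\lambda+3}}$ respectively; the direct evaluation through Dedekind sums and the function names are assumptions of the sketch.

```python
from fractions import Fraction
from math import gcd, pi, sin, sqrt
import cmath


def dedekind_sum(h, k):
    """Exact Dedekind sum s(h, k)."""
    total = Fraction(0)
    for r in range(1, k):
        x = Fraction((h * r) % k, k)
        if x != 0:
            total += (Fraction(r, k) - Fraction(1, 2)) * (x - Fraction(1, 2))
    return total


def A(k, n):
    """A_k(n) from the defining sum, omega = exp(pi*i*s(rho, k)) (assumption)."""
    total = 0j
    for rho in range(k):
        if gcd(rho, k) == 1:
            phase = dedekind_sum(rho, k) - Fraction(2 * ((n * rho) % k), k)
            total += cmath.exp(1j * pi * float(phase))
    return total.real


def A_theorem6(beta, n):
    """Closed form of Theorem 6 for q = 3**beta, beta >= 1."""
    modulus = 3 ** (beta + 1)
    nu = 1 - 24 * n
    for m in range(1, modulus):
        if (8 * m) ** 2 % modulus == nu % modulus:
            symbol = 1 if m % 3 == 1 else -1  # Legendre symbol (m/3)
            return ((-1) ** (beta + 1) * symbol * 2 / sqrt(3)
                    * 3 ** (beta / 2) * sin(4 * pi * m / modulus))
    raise ValueError("no square root found; nu should be 1 mod 3")


def A_theorem7(lam, n):
    """Closed form of Theorem 7 for q = 2**lam, lam >= 0."""
    modulus = 2 ** (lam + 3)
    nu = 1 - 24 * n
    for m in range(1, modulus, 2):  # m is necessarily odd
        if (3 * m) ** 2 % modulus == nu % modulus:
            symbol = 1 if m % 4 == 1 else -1  # Jacobi symbol (-1/m)
            return (-1) ** lam * symbol * 2 ** (lam / 2) * sin(4 * pi * m / modulus)
    raise ValueError("no square root found; nu should be 1 mod 8")


if __name__ == "__main__":
    for n in range(4):
        print(n, round(A(9, n), 9), round(A_theorem6(2, n), 9),
              round(A(8, n), 9), round(A_theorem7(3, n), 9))
```

Either square root $m$ may be used: replacing $m$ by its negative changes the sign of both the sine and the quadratic symbol.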


Proof. For brevity define u and v by 


A+ 1 
2 2 


so that \=u-+v. The numbers p which are less than 2* and odd may be repre- 
sented by 


+ 
= fT u 
h 


Hence A.\(m) may be written as a double sum 
2v—1 
(3.5) A}(n) = exp [g,(r + 2%h)wi/12- 
(rt) h=0 
For we consider the difference 
A; = +- 2"hy) —_ + 2“he). 


By (1.15), 4,=0 (mod 3). Assuming that \>4 so that u>3, and 2u=A+3 
we find from (1.16) that 


(3.6) A, = — =} (mod 2+3), 


Since 


p 
= 


(3.7) vy =p = 1 (mod 8), 


it follows that v is a quadratic residue of 2+*. In view of (3.6), A,=0 
(mod 2«+*) for all tr. We proceed to arrange the values of 7 into sets accord- 
ing to the highest power of 2 dividing r?—». 

If 7? (mod 2°‘), 7 will be said to belong to set 1. For such a 7, A,=0 
(mod 2“+*) for all pairs (A, 42), but since each h <2’, h,—/z is never divisible 
by 2°, that is, A, 40 (mod 2°+*). This means that those terms of (3.5) for 
which 7 belongs to set 1 correspond to g’s of the form 


gn(t + 2"h) = c, + 3M;,,2"*3 (mod 24-2), 


where M,,, runs with 4 over the numbers 0, 1, 2, - - - , 2?—1 in some order. 
Hence the contribution to (3.5) from any member r of set 1 is 


2rth 
exp | ] > exp 


h=0 


12-2» 


Hence we need only consider those r’s which do not belong to set 1. For such 
numbers r?=p (mod 2‘). For any there exists an =/,+2°-! such 
that (4;—/2)=0 (mod 2°-') and the corresponding difference A, is divisible 
by 3-2"-2°-!.24=24-2%. Since the corresponding terms of (3.5) make pre- 
cisely the same contribution to Asa(m) we may contract (3.5) to read 


(3.8) Ap(n) = exp [ga(r + 2%h)wi/12-2], 


(rT) h=0 
where now the outer sum extends over the values of 7 for which 
7? = 5b (mod 2%). 


If 7? (mod 25), 7 will be said to belong to set 2. For such a 7, A,=0 
(mod 2«+), but A, #0 (mod 2°+°), since now h,—/z is never divisible by 2°-'. 
This means that the terms of (3.8) belonging to a fixed number 7 of set 2 
contribute nothing to A,\(m). We may therefore ignore all r’s but those for 
which r?=p (mod 2°). Moreover if for any such 7, 42=/,+2°~* the corre- 
sponding A, will be divisible by 3-2«-2°-?.25=24-2* so that the corresponding 
contributions to (3.8) are identical. Hence 


= 22> exp [ga(r + 2%h)wi/12-2], 


h=0 
where now r?=p (mod 2°). Repeating the argument we may reduce the terms 
of the inner sum to a single term corresponding to h=0 and obtain 


(3.9) A}(n) = 2°>> exp [g,(r)ri/12-2], 
() 




where 
(3.10) 7? = (mod 


At this point we separate two cases according to the parity of X. If d is 
odd, «=v+3, and since 7 is chosen from the odd numbers <2*=2*t* the 
congruence (3.10) has four solutions 


r=ty, + (y+2°**) (mod 
where y?=p (mod 2°+*). If we consider the difference 
Dy = + 2°*?) — ga(y), 
we find by (1.15) that D,=0 (mod 3), while by (1.16) 


1 
v(y + 2e+3) 
= 2+2{ py? 1+ vy2*t?} (y? + (mod 


Since vy?— 1=0 (mod 2°+*) the last factor may be taken modulo 2\+*~*-* =4, 
and is =1 (mod 4) since 720. Hence 


and 
+ = ga(y) + 3c2**! (mod 24- 2%) 


where c is an integer =1/37 = —vy (mod 4). Since g(p) = —g(—p) (mod 24-2), 
(3.9) becomes 


= 2°+1{cos + cos — yx/2]} 


Ty 
cos [gn(y)x/12-2 — yx/4]. 


But v+2=(A+1)/2 and cos ry/4 =(2/y)(1/21/”), so that we have 


2 
(3.11) A, (n) = (2/2) cos [ga(y)4/12-2* — ry/4]. 


Before proceeding further we take up the case of even. In this case 
u=v+4. Since r <2“ =2°+ there are 8 values of r which are needed in (3.9). 
These are 


r= + (y + j2°t) (mod 2°+8) (j = 0, 1, 2, and 3). 
Now if 
Dz = + — gal), 




then D.=0 (mod 3) and 


1 
D, = - 
avy + j2°**) 


= = (mod 
Therefore 


+ J2°**) = + (mod 3-2)+8), 
so that if 


F(j) = exp + j2°t*)wi/12-2*] = exp [gn(y)wi/12- 2 
then F(0) = —F(2) and F(1) =F (3), and (3.9) becomes 
Ar(n) = cos [gn(y)e/12-2* + 3yx/4] 
= — (2/2) cos [g,(y)4/12-2 — yr/4]. 


In view of (3.11) we may combine the results for $\lambda$ even and odd by writing


(3.12) Asm) = (— cos — yx/4]. 


To evaluate this cosine we refer once more to (1.15) and (1.16) and write 
gn(y) =0 (mod 3) and 


2 
= ry +7 6-2( ) + (mod 2)+3), 


so that if we define the integer m by 


m = (vy + 7)/6 = v'/2/3 (mod 2), 
we have 


cos — yx/4] 


Hence substituting into (3.12) we have 


A, (n) = (- ( 


+3 


where m is an integer =v"/?/3 (mod 2)+*). But 



{= (= 
| 
2 
—2\ 4xm 

= (= sin —— - 

| 7, 




so the theorem follows, when $\lambda>4$.

As a matter of fact the theorem is true when $\lambda\le4$. This may be verified
by merely consulting the tables of $A_k(n)$.

The following corollary, which is a consequence of Theorems 1, 2, 3, and 5, 
is especially useful in applying series (1.1). The proof, which involves the 
separation of three cases, is left to the reader. 

COROLLARY. If k is divisible by the prime p and if $A_p(n)=0$, then $A_k(n)=0$.

4. Estimates and remainders. In this section we apply the preceding results
to the discussion of the order of magnitude of $A_k(n)$ and to the estimation
of the errors committed by taking only the first N terms of the
Hardy-Ramanujan and Rademacher series.


THEOREM 8. Let $\omega$ be the number of distinct odd prime factors of k. Then
$$|A_k(n)| < 2^{\omega}k^{1/2}.$$
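Proof. Let
$$k = 2^{\lambda}p_1^{\lambda_1}p_2^{\lambda_2}\cdots p_\omega^{\lambda_\omega}$$
be the decomposition of k into its prime factors. By §2, there exists a set
$(n_0, n_1, \cdots, n_\omega)$ such that

(4.1) $A_k(n) = A_{2^{\lambda}}(n_0)\,A_{p_1^{\lambda_1}}(n_1)\cdots A_{p_\omega^{\lambda_\omega}}(n_\omega).$

By Theorem 7
$$|A_{2^{\lambda}}(n_0)| < 2^{\lambda/2},$$
while by Theorems 5 and 6
$$|A_{p^{\lambda}}(n)| < 2p^{\lambda/2}.$$
Hence the theorem follows from (4.1).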

Since, for every $\epsilon>0$,
$$2^{\omega} = O(k^{\epsilon}),$$
we have as an immediate consequence

(4.2) $A_k(n) = O(k^{1/2+\epsilon}).$
Using this result it is possible to prove the following: 


THEOREM 9. Let w be any number greater than $2\pi(2/3)^{1/2}=5.130\cdots$, then
for all sufficiently large n, p(n) is the nearest integer to the first $[wn^{1/2}/\log n]$
terms of the Hardy-Ramanujan series.


This theorem is similar to a result of Hardy and Ramanujan* based on
$A_k(n)=O(k)$ in which w is replaced by $4\pi(2/3)^{1/2}$, and may be proved in the
same way. However it is possible to prove somewhat more than (4.2).

* Loc. cit., pp. 107-108, §6.3.


THEOREM 10. For every $\epsilon>0$ there exists a K such that for all n and for all
k>K
$$|A_k(n)| < k^{1/2+(1+\epsilon)\log 2/\log\log k}.$$
Proof. This follows easily from a theorem of Wigert* to the effect that
for every $\epsilon>0$ there exists a K such that if k>K
$$\tau(k) < k^{(1+\epsilon)\log 2/\log\log k},$$
where $\tau(k)$ is the number of divisors of k, and from the trivial inequality
$$2^{\omega} \le \tau(k).$$


In contrast to this theorem we prove: 


THEOREM 11. For every $\epsilon>0$ there exist infinitely many values of n and k
for which
$$|A_k(n)| > .11367\,k^{1/2+(1-\epsilon)\log 2/\log\log k}.$$


Proof. Let p be a prime >3 and let $n_p$ be an integer $\equiv -35/24 \pmod p$.
Then
$$\nu = 1 - 24n_p \equiv 6^2 \pmod p.$$
Applying Theorem 5 with $k=p$ and $n=n_p$ we find $m\equiv(p^2-1)/4 \pmod p$ and

(4.3) $|A_p(n_p)| = 2p^{1/2}\cos\dfrac{\pi}{p} > 2p^{1/2}\Big(1-\dfrac{\pi^2}{2p^2}\Big).$

Let $p_j$ denote the jth prime $\ge2$. By Theorem 1 there exists an n such that
$$A_k(n) = A_2(0)A_3(0)A_5(n_5)\cdots A_{p_t}(n_{p_t}),$$
where $k=2\cdot3\cdot5\cdot7\cdots p_t$ and where t will be determined later. Applying
(4.3) we find


| > (1 - 


j=3 2p? 


1 — ?/2(2j — 1)? 
IT /2(2j — 1)?) 
61/2 (1 — w?/2)(1 — — 32/162) 

> .12044k1/2.2*| cos (42/81/2) | 


(4.4) $|A_k(n)| > .11367\,k^{1/2}\cdot2^{t}.$


* Archiv for Matematik, Astronomi, och Fysik, vol. 3 (1906-1907), no. 18. See also Landau 
Handbuch, vol. 1, p. 220. 




As to the factor $2^t$, we have by the prime number theorem
$$\log 2^t = t\log 2 = \pi(p_t)\log 2 \sim p_t\log 2/\log p_t.$$
But
$$p_t \sim \sum_{j=1}^{t}\log p_j = \log k.$$
Hence
$$\log 2^t \sim \frac{\log 2\,\log k}{\log\log k}.$$
Therefore if $\epsilon>0$ is given
$$\log 2^t > \frac{(1-\epsilon)\log 2\,\log k}{\log\log k}$$
for all sufficiently large values of t and k. In other words there are infinitely
many k's for which
$$2^t > k^{(1-\epsilon)\log 2/\log\log k}.$$


From this and (4.4) the theorem follows at once. 
In the subsequent discussion however we need an estimate of $A_k(n)$ for
all values of k and n. The one we shall use is given by the following:


THEOREM 12. For all n and k

(4.5) $|A_k(n)| < 2k^{5/6}.$


Proof. Let $\omega(k)$ denote as before the number of odd prime factors of k and
let $P_j$ be the product of the first j odd primes. The number k being given, there
exists a j such that
$$P_j \le k < P_{j+1}.$$
This means that

(4.6) $\omega(k) \le j.$

Suppose for the moment that $k\ge105=P_3$. Then since $P_{j+1}>k$, we have $j\ge3$.
Hence
$$k \ge P_j \ge 3\cdot5\cdot7\cdot11^{\,j-3},$$
$$j \le .96026\log_{10}k + 1.05914.$$
Therefore in view of (4.6)
$$2^{\omega(k)} \le 2^j < 2.0836\,k^{.96026\log_{10}2} < 2k^{1/3},$$
since k>3. It follows by Theorem 8 that
$$|A_k(n)| < 2k^{5/6} \quad\text{for } k\ge105.$$
If $15\le k<105$, then $2^{\omega(k)}\le4<2\cdot(15)^{1/3}$ and $|A_k(n)|<2(15)^{1/3}k^{1/2}\le2k^{5/6}$.
For $1\le k<15$, $2^{\omega(k)}\le2$ so that $|A_k(n)|<2k^{1/2}\le2k^{5/6}$. Hence the inequality
(4.5) holds for all k.
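
The constants used in this proof can be checked numerically. The following Python sketch is not part of the original argument, and the names `omega_odd` and `bound_holds` are illustrative. It recomputes the coefficients of the bound for j from $k\ge3\cdot5\cdot7\cdot11^{j-3}$ and tests $2^{\omega(k)}<2k^{1/3}$ directly over a finite range of k, which is a spot check and not a proof.

```python
import math

def omega_odd(k):
    """Number of distinct odd prime factors of k."""
    while k % 2 == 0:
        k //= 2
    count, p = 0, 3
    while p * p <= k:
        if k % p == 0:
            count += 1
            while k % p == 0:
                k //= p
        p += 2
    if k > 1:
        count += 1
    return count

# From k >= 3*5*7*11**(j-3) one gets j <= c1*log10(k) + c0.
c1 = 1 / math.log10(11)
c0 = 3 - math.log10(105) / math.log10(11)
lead = 2 ** c0                # coefficient in 2**j <= lead * k**expo
expo = c1 * math.log10(2)     # exponent of k in that bound

def bound_holds(k):
    """True when 2**omega(k) < 2*k**(1/3), the step combined with Theorem 8."""
    return 2 ** omega_odd(k) < 2 * k ** (1 / 3)

print(round(c1, 5), round(c0, 5), round(lead, 4), round(expo, 5))
print(all(bound_holds(k) for k in range(1, 20001)))
```

The exponent `expo` is smaller than 1/3, which is why the coefficient slightly above 2 can be absorbed once k exceeds 3.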

It is clear that an infinite number of inequalities similar to (4.5) but with 
smaller powers (>1/2) of k may be established in the same way (for example 
|4.(n)| <3k?/*), but only at the expense of larger constant coefficients. 

We now consider the remainder of Rademacher’s convergent series for 


p(n). 


(4.7) p(n) = 


where 
= 4(2/3)'/2, An = (m — 1/24)1/2, 


Introducing the notation 

(4.8) $\mu = \pi(24n-1)^{1/2}/6,$
(4.9) A,(n) = 

we may write (4.7) in the form 


121/72 k k 
(4.10) p(n) = (n) {(1 4 (1 + =) + Ry(n), 
24n 


— 


where the remainder $R_N(n)$ may be written after expanding the exponentials
and collecting


4(12)'2 = 
By Theorem 12 
(4.12) | Ait(n)| < 
Therefore if we eliminate m from (4.11) by means of (4.8) we obtain 


4231/2 pay J(u/k)?? 
Oy? k=N+1 j=1 (2j + 1)! 
4231/2 x 2741/3 




Setting u/x =/ and defining r by 
n/N, 
we obtain on eliminating x 


4231/2 r jens 
| Ru(n) | < f 
j= (27 + 1)! 
N-2/3q72 
(27 + 3)(37 + 1)(27 + 1)! 
1 


32 13 +3)! 


< 


So that 


(4.14) $|R_N(n)| < N^{-2/3}F(r),$


(sinh r 1 1 


We give a few values of F(r) for typical values of r. 
r F(r) F(r) 
1.9480 ; 2.6831 
2.0122 . 3 .0233 
~ 2.1085 3.4825 
2.2444 i 4.1044 
3. 2.4308 4.9515 


To illustrate the use of (4.14) we give the following examples.

I. Find the maximum error committed by using only 18 terms of
the Rademacher series for p(599). Here we have $\mu=62.777$ and $r=\mu/18=3.4876$.
Hence $F(r)=2.720$ so that $|R_{18}(599)|<.396$. Actually† R=.00027.

II. Find $R_{21}(721)$. Here $\mu=68.8746$, $\mu/21=r=3.2797$, $F(r)=2.596$, and
$|R_{21}(721)|<.341$. Actually† R=.00041.
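
The two examples can be reproduced with a few lines of Python. This is a sketch under the assumption that (4.8) reads $\mu=\pi(24n-1)^{1/2}/6$ and that (4.14) takes the form $|R_N(n)|<N^{-2/3}F(r)$ with $r=\mu/N$; both readings agree with the figures quoted in the examples. The values of F(r) are copied from the text and not recomputed, and the function names are illustrative.

```python
import math

def mu(n):
    """mu = pi*(24n - 1)**(1/2)/6, the assumed reading of (4.8)."""
    return math.pi * math.sqrt(24 * n - 1) / 6

def remainder_bound(N, F_r):
    """Right member of the assumed form of (4.14): N**(-2/3) * F(r)."""
    return N ** (-2 / 3) * F_r

# (n, number of terms N, F(r) as quoted in the examples)
examples = [(599, 18, 2.720), (721, 21, 2.596)]

for n, N, F_r in examples:
    r = mu(n) / N
    print(n, N, round(mu(n), 4), round(r, 4), round(remainder_bound(N, F_r), 3))
```

In both cases the bound is below 1/2, so the nearest integer to the partial sum is p(n).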

We now consider the difference $d_N(n)$ between the sums of the N first
terms of Rademacher series and that of Hardy and Ramanujan. That is, in
view of (4.10),

21/2 N 


dy(n) = (1 + =) 
24n — 1 
Using (4.12) we find 


4g1/2 N N 
| dy(n) | < {f + + N13 (1 + cnn 
24n — 1 


Since e~*/? <<x/p we find, on writing r=A/N and 24n—1=(6y/z)?, 
† Journal of the London Mathematical Society, vol. 11 (1936), pp. 115-116.


(4.15) | dy(n)| < or? 


This estimate, crude though it is, shows that, for typical calculations of p(n),
$d_N(n)$ is sensibly zero.

We are now in a position to answer the question: When is the Hardy- 
Ramanujan series applicable? This may be answered in a number of ways of 
which the following is an example. 


THEOREM 13. If only $2n^{1/2}/3$ terms of the Hardy-Ramanujan series (1.1)
be taken, the resulting sum will differ from p(n) by less than 1/2, provided
n>600.†

Proof. If $N=2n^{1/2}/3$, then
$$r = \frac{\mu}{N} = \frac{\pi(24n-1)^{1/2}}{4n^{1/2}} < \pi(3/2)^{1/2} = 3.847\cdots.$$
Since n>600, $N=\tfrac{2}{3}n^{1/2}>16$, and $\mu>62.832$. Hence by (4.14)
$$|R_N(n)| < F(3.847)\,16^{-2/3} < .46.$$
Now since the right member of (4.15) is a decreasing function of $\mu$, we obtain
$$|d_N(n)| < .0031.$$
Hence the sum of the first N terms of the Hardy-Ramanujan series differs
from p(n) by an amount which is less than
$$|R_N(n)| + |d_N(n)| < .46 + .0031 < 1/2$$
in absolute value.
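
The numerical facts used in this proof are easy to confirm. The short Python sketch below is an illustration only: it assumes $\mu=\pi(24n-1)^{1/2}/6$ as in (4.8), takes $N=2n^{1/2}/3$ as a real number without rounding, and uses illustrative function names. It checks that $r=\mu/N$ stays below $\pi(3/2)^{1/2}$ and that N exceeds 16 once n>600.

```python
import math

def mu(n):
    """mu = pi*(24n - 1)**(1/2)/6, the assumed reading of (4.8)."""
    return math.pi * math.sqrt(24 * n - 1) / 6

def terms(n):
    """N = 2*n**(1/2)/3, the number of terms in Theorem 13 (not rounded)."""
    return 2 * math.sqrt(n) / 3

# Limiting value of r = mu/N as n grows: pi*sqrt(24)/4 = pi*sqrt(3/2).
r_limit = math.pi * math.sqrt(24) / 4

for n in (601, 1000, 10000):
    print(n, round(terms(n), 3), round(mu(n) / terms(n), 5))
print(round(r_limit, 5))
```

The ratio r increases toward its limit with n, so the single table value F(3.847) serves for every n>600 only because F varies slowly near that point.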

The factor 2/3 of Theorem 13 may be made smaller by allowing the lower
limit of n to increase. For example if we wish to take only $n^{1/2}/2$ terms of the
series we may do so provided n>3600. By making a general argument we
may easily prove the following:

THEOREM 14. Let and let --- . Then p(n) is the 


nearest integer to the sum of the first n!?/6 terms of the Hardy-Ramanujan series 
provided 


n> 


271/2¢6 (cds) 1 


3 


† The tables of p(n) for $n\le600$ have been published by Gupta, Proceedings of the London
Mathematical Society, (2), vol. 39 (1935), pp. 148-149; vol. 42, pp. 546-549.


LEHIGH UNIVERSITY, 
BETHLEHEM, Pa. 




A PROBLEM IN ADDITIVE NUMBER THEORY* 


BY 
R. D. JAMES 


1. Introduction. Some time ago the author was asked by Professor D. N. 
Lehmer if there was anything known about the representation of an integer h 
in the form 


where all the prime factors of each $h_i$ are of a given form. A search of the
literature seemed to indicate that various theorems had been conjectured but
none actually proved.† For example, L. Euler stated without proof that every
integer of the form 4j+2 is a sum of two primes each of the form 4j+1. Even
the weaker statement that every integer of the form 4j+2 is a sum of two
integers which have all their prime factors of the form 4j+1 has not yet been
proved.

In view of the absence of any definite results in the literature it seems 
worthwhile to point out that some very interesting theorems can be obtained 
in an elementary way. This is done in Part I of this paper and the results 
are summarized in Theorems 1, 2, and 3 below. In Part II we use the method
of Viggo Brun‡ to prove a general theorem and from this we deduce Theorems
4 and 5 below.


THEOREM 1. Consider the set of all integers $n_i$ with the property that $n_0=1$
and that every prime factor of each $n_i$, $i\ge1$, is of the form 4j+1. Let r=3, 4, 5,
or 6. Then every integer $N\equiv r \pmod 4$, $N\ge r$, is a sum of exactly r integers $n_i$
all but three of which may be taken equal to 1. Except for r=6 this result is the
best possible in the sense that there is an infinite number of integers $N\equiv r \pmod 4$
which are not the sum of fewer than r integers $n_i$.


THEOREM 2. Let N be any integer of the form 4j+2. If the integer 8j+2
is of the form $2Kp_1^{2\alpha_1}\cdots p_t^{2\alpha_t}$, where $p_\nu\equiv3 \pmod 4$, $\nu=1, 2, \cdots, t$, and every
prime factor of K is of the form 4j+1, then N is a sum of exactly two integers $n_i$.


* Presented to the Society, April 3, 1937; received by the editors March 23, 1937. 

† L. E. Dickson, History of the Theory of Numbers, vol. I, Chap. XVIII, and vol. II, Chap.
VII.

‡ See the paper by H. Rademacher, Abhandlungen aus dem Mathematischen Seminar der
Hamburgischen Universität, vol. 3 (1924), pp. 12-30.


ADDITIVE NUMBER THEORY 297 


THEOREM 3. Consider the set of all integers $m_i$ with the property that $m_0=1$
and that every prime factor of $m_i$, $i\ge1$, is of the form 8j+1 or 8j+3. Then every
odd integer $M\ge3$ is a sum of exactly three integers $m_i$ and every even integer
$M\ge4$ is a sum of exactly four integers $m_i$. The results for M odd and
$M\equiv0 \pmod 8$ are the best possible.


THEOREM 4. Every sufficiently large integer $N\equiv2 \pmod 4$ is a sum of two
integers which have all except possibly two of their prime factors of the form 4j+1.


THEOREM 5. Every sufficiently large integer $M\equiv2, 4, 6 \pmod 8$ is a sum
of two integers which have all except possibly two of their prime factors of the
form 8j+1 or 8j+3.

Part I 


2. Preliminary lemmas. The lemmas which follow are well known and 
we shall state them without proof. 


Lemma 1.* If k is any integer such that
$$k \not\equiv 0,\ 7,\ 12 \text{ or } 15 \pmod{16}$$
then there exist integers $x_1$, $x_2$, and $x_3$ such that
$$k = \sum_{\nu=1}^{3}x_\nu^2.$$

Lemma 2.† If x and y have no common factor and are not both odd, every
prime factor of $x^2+y^2$ is of the form 4j+1.

Lemma 3. If x and y have no common factor and x is odd, every prime factor
of $x^2+2y^2$ is of the form 8j+1 or 8j+3.

* E. Landau, Handbuch der Lehre von der Verteilung der Primzahlen, vol. 1, pp. 550-555.

† Lemmas 2 and 3 follow from the fact that -1 is a quadratic residue of an odd prime p if and
only if p is of the form 4j+1, and that -2 is a quadratic residue of an odd prime p if and only if p
is of the form 8j+1 or 8j+3.

3. The proof of Theorem 1. We suppose first that r=3 so that N is of the
form 4j+3. We have
$$8j+3 \not\equiv 0,\ 7,\ 12, \text{ or } 15 \pmod{16},$$
so that by Lemma 1

(1) $8j+3 = \sum_{\nu=1}^{3}x_\nu^2.$

Since $x_\nu^2$ is of the form 4j or 4j+1 according as $x_\nu$ is even or odd, it follows
that each $x_\nu$ in (1) must be odd. Let $x_\nu=2s_\nu+1$. Then (1) becomes


R. D. JAMES [March 


$$8j+3 = \sum_{\nu=1}^{3}(2s_\nu+1)^2,$$
$$N = 4j+3 = \sum_{\nu=1}^{3}\{s_\nu^2+(s_\nu+1)^2\}.$$
Obviously $s_\nu$ and $s_\nu+1$ have no common factor and are not both odd. Hence
by Lemma 2 every prime factor of the integer $s_\nu^2+(s_\nu+1)^2$ is of the form
4j+1. This proves the first part of Theorem 1 when r=3.
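
The construction for r=3 is effective and can be carried out by a direct search. The Python sketch below is an illustration, not the author's procedure: the names `three_odd_squares`, `prime_factors` and `represent` are hypothetical, and the representation of 2N-3=8j+3 as a sum of three odd squares is found by brute force. Each odd $x=2s+1$ contributes the summand $(x^2+1)/2=s^2+(s+1)^2$.

```python
from math import isqrt

def three_odd_squares(n):
    """Return odd x1 <= x2 <= x3 with x1^2 + x2^2 + x3^2 == n, or None."""
    for x1 in range(1, isqrt(n) + 1, 2):
        for x2 in range(x1, isqrt(n - x1 * x1) + 1, 2):
            rest = n - x1 * x1 - x2 * x2
            x3 = isqrt(rest)
            if x3 >= x2 and x3 * x3 == rest and x3 % 2 == 1:
                return x1, x2, x3
    return None

def prime_factors(m):
    """Set of prime factors of m >= 1 by trial division."""
    out, p = set(), 2
    while p * p <= m:
        while m % p == 0:
            out.add(p)
            m //= p
        p += 1
    if m > 1:
        out.add(m)
    return out

def represent(N):
    """For N = 4j+3, return three integers s^2 + (s+1)^2 whose sum is N."""
    assert N % 4 == 3
    xs = three_odd_squares(2 * N - 3)        # 2N - 3 = 8j + 3
    return [(x * x + 1) // 2 for x in xs]    # x = 2s+1 gives s^2 + (s+1)^2

for N in (3, 7, 11, 15, 19, 103):
    print(N, represent(N))
```

For N=7 the search gives 11=1+1+9, hence 7=1+1+5.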

Now let r=4, 5, or 6. Then $N-r+3\equiv3 \pmod 4$ and thus N-r+3 is a
sum of exactly three integers $n_i$. It follows that N itself is a sum of exactly r
integers $n_i$, all but three of which are equal to 1.

To prove the last statement of the theorem when r=3 or 4 we observe
that since we have each $n_i\equiv1 \pmod 4$ the congruence

(2) $N \equiv \sum_{j=1}^{s}n_{i_j} \pmod 4$

has no solution when s<r. Therefore the equation

(3) $N = \sum_{j=1}^{s}n_{i_j}$

certainly has no solution when s<r.

When r=5 we consider the set of all integers N=, - - - pt, where 
every $p_\nu$ is of the form 4j+3. It is evident that $N\equiv1 \pmod 4$. For these
integers the congruence (2) has no solution when 1<s<5 and hence the
equation (3) has no solution when s<5.

4. The proof of Theorem 2. Since the integer 8j+2 is of the form
$2Kp_1^{2\alpha_1}\cdots p_t^{2\alpha_t}$, where $p_\nu\equiv3 \pmod 4$ and every prime factor of K is of the
form 4j+1, there exist integers u and v such that*

(4) $8j+2 = u^2+v^2.$

By the argument used in the proof of Theorem 1, both u and v must be odd.
Let u=2y+1, v=2z+1. Then (4) becomes
$$8j+2 = (2y+1)^2+(2z+1)^2,$$
$$N = 4j+2 = \{y^2+(y+1)^2\}+\{z^2+(z+1)^2\}.$$
Every prime factor of $y^2+(y+1)^2$ and $z^2+(z+1)^2$ is of the form 4j+1 and
this completes the proof.


* E. Landau, loc. cit., pp. 549-550. 




5. The proof of Theorem 3. We suppose first that M=2k+3. If
$$k \not\equiv 0,\ 7,\ 12, \text{ or } 15 \pmod{16}$$
we have
$$k = \sum_{i=1}^{3}x_i^2,$$
$$2k+3 = \sum_{i=1}^{3}(2x_i^2+1).$$
By Lemma 3 every prime factor of $2x_i^2+1$ is of the form 8j+1 or 8j+3.

If $k\equiv0$ or 12 (mod 16) then $2k-21\equiv3$ or 11 (mod 16). Then*
$$2k-21 = \sum_{i=1}^{3}x_i^2,$$

(5) $2k+3 = \sum_{i=1}^{3}(x_i^2+8).$

In (5) every $x_i$ is odd and the result follows from Lemma 3.

If $k\equiv7$ or 15 (mod 16) then $2k-3\equiv11$ (mod 16) and we have
$$2k-3 = \sum_{i=1}^{3}x_i^2,$$
$$2k+3 = \sum_{i=1}^{3}(x_i^2+2).$$
Again $x_i$ is odd and the theorem follows as before.

The rest of the theorem is a consequence of the first part since M—1 is 
odd if M is even. The results can be shown to be the best possible by using 
congruential conditions similar to those used in the proof of Theorem 1. 
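
The three cases of the proof for odd M translate directly into a procedure. The Python sketch below is an illustration with hypothetical names (`three_squares`, `prime_factors`, `represent_odd`); the sums of three squares are found by exhaustive search, and the case M=3 mentioned in the footnote is treated separately. The summands are $2x^2+1$, $x^2+8$, or $x^2+2$ according to the residue of k modulo 16.

```python
from math import isqrt

def three_squares(n):
    """Return nonnegative a <= b <= c with a^2 + b^2 + c^2 == n, or None."""
    for a in range(isqrt(n // 3) + 1):
        for b in range(a, isqrt((n - a * a) // 2) + 1):
            rest = n - a * a - b * b
            c = isqrt(rest)
            if c * c == rest:
                return a, b, c
    return None

def prime_factors(m):
    """Set of prime factors of m >= 1 by trial division."""
    out, p = set(), 2
    while p * p <= m:
        while m % p == 0:
            out.add(p)
            m //= p
        p += 1
    if m > 1:
        out.add(m)
    return out

def represent_odd(M):
    """Split odd M >= 3 into three summands following the cases of the proof."""
    k = (M - 3) // 2
    if k == 0:
        return [1, 1, 1]                     # the special case M = 3
    res = k % 16
    if res not in (0, 7, 12, 15):
        return [2 * x * x + 1 for x in three_squares(k)]
    if res in (0, 12):
        return [x * x + 8 for x in three_squares(2 * k - 21)]
    return [x * x + 2 for x in three_squares(2 * k - 3)]

for M in (3, 5, 17, 27, 35, 101):
    print(M, represent_odd(M))
```

For M=27 one has k=12, 2k-21=3=1+1+1, and 27=9+9+9, each summand being a power of the prime 3.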


Part II 


6. The Viggo Brun method. In this part we use the results of the paper
by Rademacher to which reference was made above. This will be cited as R.†
Let fi, p2,--+, be any infinite set of primes which are all distinct. Let 
by, be, ---, be any integers such that a;+b;. For (A, D) =1 let 


P(A, D, bi, b, br) = P(D, » Pr) 


* The case k=0, M=3 is not included here but obviously M=3=1+1+1.

† T. Estermann, Journal für die Reine und Angewandte Mathematik, vol. 168 (1932), pp. 106-116,
has improved Rademacher's results. For the problem which we are considering, however,
Estermann's method does not yield anything more.


denote the number of integers z which satisfy the conditions

(6) $0<z\le x,\quad z\equiv A \pmod D,\quad (z-a_i)(z-b_i)\not\equiv0 \pmod{p_i}\quad (i=1, 2, \cdots, r).$


Then by R, (8) we have 


E 
PD, 33 Pu tr) R, 


where 
1 1 


1 
B<a 
R = (2r + 1)(2r1 + 1)? -- (27, + 1)?, 
We now assume that the primes fy, - - - , p, are the first r primes in order 
of any infinite set of primes which have the property that 


1 1 
(7) dS’ — = — log log w + ai(a) + (1). 
@ 


Here >>’ or []’ denotes the sum or product over all primes of the set which 
are <w. From (7) and a general theorem on infinite series* it follows that 


2 1 
3S (log w)*/« (log 
If a=1, this reduces to the case treated by Rademacher. 
Now let # and hy be any two numbers such that 


0 < 2 log < 1. 


Then from (7) and (8) it follows that there is a number w» such that for all 
w we have 


1 
0< — < log ho, 
w<psw 


These are precisely the equations (15a) which are used in R. All the results 
obtained there go over to the case which we are considering. Thus from R, 
(18) and (26) we obtain 


* K. Knopp, Theorie und Anwendung der Unendlichen Reihen, page 218. 



1 
h? 




log® ho e?(e? 5) 
1 — e*h? log? ho 
R < 
where c; depends only on a, h, fo, and E,>1—2 log ho, 2 < (10 log* ho) /3. If 
we take io=1.3 we find that 
Cc 


x 
9 P(D, x; Pe) > — — 


where C and C’ depend only on a, hk. It is this inequality which we use to 
prove Theorems 4 and 5. 

7. The proof of Theorem 4. Let x in (6) be of the form 4j+2. Consider
the infinite set of primes p which are of the form 4j+3. In this case we have*
$$\sum_{\substack{3\le p\le w\\ p\equiv3\ (\mathrm{mod}\ 4)}}\frac{1}{p} = \frac{1}{\phi(4)}\log\log w + a_1 + o(1).$$
Then $\alpha=2$ and we may take $h=1.68<(1.3)^2$. From (9) we have


C 2 
10 P(D, x; >= — C’p,268/68 
pr) D p, P 


Let $p_1, p_2, \cdots, p_r$ be the primes 7, 11, $\cdots$ up to the largest prime of the form
4j+3 which does not exceed $x^{1/4}$. We choose $a_i$ and $b_i$ in the following manner.


if pi | x. 
We choose A so that $z\equiv1 \pmod 4$ and so that neither A nor x-A is divisible
by 3. Then A is determined (mod 12). Using the fact that $p_r\le x^{1/4}$ the
inequality (10) becomes
2C 


x 
P(12, ++ Pr) > — — C'4268/272, 
3 log x 


Hence for x sufficiently large we have $P(12, x; p_1, \cdots, p_r)\ge1$. Going back
to the definition of $P(12, x; p_1, \cdots, p_r)$ we see that this means that there is
at least one integer z such that
$$0<z\le x,\quad z\equiv A \pmod{12},$$
$$z(z-x)\not\equiv0 \pmod{p_i},\ p_i\nmid x;\qquad z(z-1-x)\not\equiv0 \pmod{p_i},\ p_i\mid x.$$


* E. Landau, loc. cit., pp. 449-450. 


a;=0, b; = x, if pil x; 




This shows that there is at least one integer z for which
$$x = z+(x-z),$$
where neither z nor $x-z$ is divisible by 2 or by any prime of the form 4j+3
which does not exceed $x^{1/4}$. If a prime of the form 4j+3 does divide z or $x-z$,
then it must be greater than $x^{1/4}$. This proves that not more than three primes
of the form 4j+3 can divide z or $x-z$. The number three can be reduced to
two by the following argument. Both z and $x-z$ are of the form 4j+1, but
a product of three primes of the form 4j+3 is again of the form 4j+3. Therefore
not more than two primes of the form 4j+3 can divide z or $x-z$. This
proves Theorem 4.

8. The proof of Theorem 5. The proof of this theorem is only slightly dif- 
ferent from the proof of Theorem 4. We have* 
log | + 2c + o(1 

8Spsw P (8) “ 
= Sor7 (mod 8) 

and again $\alpha=2$, h=1.68. This time we choose A so that z has the following
values (mod 8) and so that neither A nor x-A is divisible by 3.

$z\equiv1$, $x-z\equiv1$ if $x\equiv2$ (mod 8),
$z\equiv1$, $x-z\equiv3$ if $x\equiv4$ (mod 8),
$z\equiv3$, $x-z\equiv3$ if $x\equiv6$ (mod 8).

The inequality (10) then shows that not more than three primes of the form
8j+5 or 8j+7 can divide z or $x-z$. An argument similar to that used in the
proof of Theorem 4 shows finally that not more than two primes of the form
8j+5 or 8j+7 can divide z or $x-z$. This completes the proof of Theorem 5.


* E. Landau, loc. cit. 


THE UNIVERSITY OF CALIFORNIA,
BERKELEY, CALIF. 


TRANSFORMATIONS OF A SURFACE BEARING A 
FAMILY OF ASYMPTOTIC CURVES* 


BY 
G. D. GORE 


Introduction. It is the purpose of this paper to establish certain transformations
for any non-developable surface that bears a family of asymptotic
curves and is immersed in a space of n dimensions $S_n$ (n>3). All surfaces
mentioned hereafter will belong to $S_n$.

The ambient of the osculating planes at a point of a surface to all of the
curves on the surface that go through the point is a space of not more than
five dimensions.† The class of all surfaces in $S_n$ for which the ambient at all
points is a space of four dimensions is divided into two subclasses. One of
these subclasses is composed of all surfaces in $S_n$ that bear each a conjugate
net of curves, while the other is composed of all surfaces in $S_n$ that sustain
each a family of asymptotic curves.

In the classical transformations for a surface bearing a conjugate net, the
two congruences of lines tangent to the curves of the net have played basic
rôles. We shall assign a similar rôle to the $\infty^2$ lines tangent to the asymptotic
curves of a family. Although the congruence of these lines contains only a
one parameter family of developable surfaces, it will be defined as a parabolic
congruence.

To facilitate discussion, a terminology for certain geometric relations is 
introduced. 


DEFINITION 1. The asymptotic curves of a given family are said to be auto- 
conjugate to the lines of a parabolic congruence if the curves of the family lie on 
the developable surfaces of the congruence, provided that the surface which sus- 
tains the given family is not the focal surface of the congruence. 


DEFINITION 2. A family of asymptotic curves on a surface and a parabolic 
congruence, such that there is just one line of the congruence lying in each tangent 
plane of the surface and not passing through the point of contact, are harmonic to 
each other in case the developable surfaces of the parabolic congruence correspond 
to the curves of the family. 


* Presented to the Society, November 27, 1936; received by the editors March 16, 1937. 
† Lane, Projective Differential Geometry of Curves and Surfaces, University of Chicago Press,
1932, p. 124.




304 G. D. GORE [March 


DEFINITION 3. If two families of asymptotic curves are autoconjugate to the
same parabolic congruence, they are said to be in relation F'.


The transformation F’ of families of asymptotic lines is an analogue of 
the well known transformation F of conjugate systems.* 


DEFINITION 4. If the points of a surface are in a one-to-one correspondence
with $\infty^2$ straight lines, and if corresponding points and lines are in united position,
the surface is said to be transversal to the lines.


In §1 is developed a transformation of a family of asymptotic curves which
is analogous to the transformation of Levy for conjugate nets,† while in §2
is exhibited a method for determining all of the parabolic congruences auto- 
conjugate to a given family of asymptotic curves. There is developed in §3 
a method for determining a family of asymptotic curves in relation F’ with 
a given family of curves of the same kind. A study is made in §4 of the relation 
of several F’ transforms of a given family of asymptotic curves by means 
of the same parabolic congruence. The relation of two F’ transforms of a given 
surface by means of two different parabolic congruences is considered in §5, 
and a theorem of permutability for the transformation F’ is established in 
§6. General transversal surfaces of a parabolic congruence are examined in §7, 
and it is proved that these surfaces too are transformable by some of the 
methods which we have applied to surfaces bearing families of asymptotic 
curves. 

1. A transformation of a family of asymptotic curves. Consider a non-developable
surface S which sustains a family of asymptotic curves and is
immersed in a projective space of n dimensions (n>3). A parametric vector
equation of the surface may be written in terms of homogeneous coordinates
as y=y(u, v). We adopt the asymptotic curves as the u-curves, and any other
family of curves on the surface as v-curves. The resulting coordinates y of
the generating point of the surface are known to satisfy a differential equation
of the form

(1.1) $y_{uu} = ay_u + by_v + cy \qquad (b\ne0),$


called the point differential equation of the surface. The surface S is any in- 
tegral surface of equation (1.1). 

To obtain a transformation of S, let R be a solution of (1.1). The point
determined by the coordinates

(1.2) $x = Ry_u - R_uy$

is on the line which is tangent at the point y to the u-curve of S. We shall


* See Eisenhart, Transformations of Surfaces, Princeton University Press, 1923, p. 34. 
† See Eisenhart, loc. cit., p. 19.


1938] TRANSFORMATIONS OF A SURFACE 305 


show that the point x generates a surface having a family of asymptotic
curves as u-curves.

By computing derivatives of (1.2) and reducing them by means of (1.1), 
we obtain the relations 

x = Ry, — Ruy, 
(1.3) ty — ax = b(Ry, — Rwy), 
Luu — — — dyx = — 2bRoyu + (2bRu + buR) — OuRoy. 

The determinant of the coefficients of y, $y_u$, and $y_v$ in the right members of
(1.3) is equal to zero. Hence the left members satisfy a linear relation. This
relation is equivalent to the differential equation
2Ru 


R aby 
x > 
R 


bu 
(1.4) wu. = + 


which indicates that the u-curves of the surface S(x) belong to a family of 
asymptotic curves. 

The first of equations (1.3) indicates that the surface S(x) is transversal 
to the parabolic congruence of lines which are tangent to the u-curves of S(y). 
Moreover, the family of asymptotic u-curves on S(x) is autoconjugate to the 
parabolic congruence. 

From the first and second equations of (1.3), we observe that the line
which is tangent to the u-curve of S(x) at the point x, lies in the plane which
is tangent to S(y) at the point y. Hence the parabolic congruence of lines
tangent to the u-curves of S(x) is harmonic to the family of asymptotic
u-curves on S(y).

The transformation (1.2) is an analogue of the transformation of Levy 
for conjugate nets. Repeated application of this transformation produces a 
sequence of surfaces which is a close analogue of a Levy sequence of conjugate 
nets. 

We shall now prove that all families of asymptotic curves that are auto- 
conjugate to the parabolic congruence of lines tangent to the u-curves of S(y) 
are obtained by transformations of the same form as (1.2). 

Let the generating point of a surface transversal to the tangent lines of
the u-curves of S(y) have coordinates $\xi$. By means of a transformation $y=\theta\eta$,
let new coordinates $\eta$ for the generating point of S(y) be chosen so that

(1.5) $\xi = \eta_u.$

This change of coordinates transforms the differential equation (1.1) into an
equation of the form

(1.6) $\eta_{uu} = \alpha\eta_u + \beta\eta_v + \gamma\eta \qquad (\beta\ne0),$

in which certain of the coefficients are specialized by the particular choice of $\theta$.

In order that the surface $S(\xi)$ have a family of asymptotic u-curves, it is
necessary and sufficient that $\xi$ satisfy an equation of the same form as (1.1).
If, by means of equations (1.5) and (1.6) we express $\xi$, $\xi_u$, $\xi_v$, and $\xi_{uu}$ in terms
of $\eta$, $\eta_u$, $\eta_v$, and $\eta_{uv}$, and set the determinant of the coefficients of the latter
functions equal to zero, we find that

(1.7) $\beta\gamma_u - \gamma\beta_u = 0$

is the only restriction on the coefficients of (1.6) in order that $S(\xi)$ have the
required family of asymptotic curves.

The general solution of (1.7) is $\gamma=\beta f(v)$, a special case of which is
$\gamma=f(v)=0$. But in order for $\gamma$ to be zero, the above function $\theta$ must be a
solution of (1.1). Then the coordinates $\xi$ can be written as
$$\xi = \frac{1}{\theta^2}(\theta y_u - \theta_uy).$$


If f(v) 0, we introduce the value of , and verify that 
—at= B(n» + jn). 


A second transformation 7 = ,(v)¢ is introduced, where yu satisfies the condi- 


tion du/dv= —yf. As a result of this transformation, the above equation and 
(1.5) become 


fu — af = Bufo, 
= 


Integrability conditions on the left members of (1.8) show that ¢ satisfies a 
differential equation of the same form as (1.6), but with y=0. 

The above transformations y=6y and »=yf are equivalent to the single 
transformation y= Rf. Since this transformation changes (1.1) into a new 
equation in ¢, of the same form as (1.6), but with y=0, the function R is a 
solution of (1.1). 

The second of equations (1.8) can be written in the form ¢é 
=(u/R*)(Ry.— Ruy). This establishes 


THEOREM 1. Let S(y) be a surface bearing a family of asymptotic u-curves, 
and for which the point differential equation is (1.1). Let S(x) be any surface 
which sustains a family of asymptotic curves autoconjugate to the parabolic con- 
gruence of lines tangent to the u-curves of S(y). Then the transformation which 
sends S(y) into S(x) may be represented by a relation of the form $x=Ry_u-R_uy$,
in which R is a solution of (1.1).


(1.8) 




We make use of a second solution R' of equation (1.1) to construct the
transformation of S(x) which is represented by the equation

(1.9) $x' = R'y_u - R'_uy.$

From equations (1.1) and (1.9) is derived the relation

(1.10) $x'_u - ax' = b(R'y_v - R'_vy),$


which is similar to the second of (1.3). Equations (1.9) and (1.10) together 
with (1.3) indicate that the line tangent at the point x’ to the u-curve of 
S(x’), and the corresponding line tangent to the u-curve of S(x), lie in the 
plane which is tangent at the point y to S(y). Since these lines lie in a plane, 
they intersect in a point having coordinates x’’. We shall prove that the 
point x’’ generates a surface which has a family of asymptotic lines as para- 
metric u-curves, and that this family of curves is autoconjugate to each of 
the parabolic congruences formed by the lines tangent to the u-curves of S(x) 
and S(x’) respectively. 
Since R' is a solution of (1.1), the function

(1.11) $\theta = RR'_u - R_uR'$

is a solution of the point differential equation of S(x). This fact follows from
replacing y by R' in (1.2). For a similar reason, the function

(1.12) $\theta' = R'R_u - R'_uR$

is a solution of the point differential equation of S(x'). We define $\phi=RR'_v-R_vR'$.
Then by (1.11) and (1.1),

(1.13) $\theta_u = a\theta + b\phi, \qquad \theta'_u = a\theta' - b\phi.$


To determine the coordinates of the point of intersection of the two corre- 
sponding tangent lines to the u-curves of S(x) and S(x’), we list the following 
relations 

x= Ry, — Ruy, 

ty = R(ayu + — (aRu + y, 
= R’yu— Ruy, 
xt = R’'(ayu + — a(R! + bR/)y. 


(1.14) 


By eliminating y, y., and y, from (1.14), we obtain the relation 
R’ — (a0 + by)x] = — — — bp)z’]. 




After using equations (1.13) to reduce the coefficients of x and x’ in the above 
equation, we have the coordinates 


= R'(6x, — 04x) = — — 6) x’). 


Since @ and @’ are solutions of the point differential equations of S(x) and 
S(x’) respectively, the surface S(x’’) is by (1.2) a transform of S(x) and S(x’) 
alike. By virtue of these transformations, S(x’’) has a family of asymptotic 
curves as u-curves, and this family is autoconjugate to each of the two para- 
bolic congruences formed by lines tangent to the w-curves of the surfaces S(x) 
and S(x’) respectively. 

Since, by the remark following equation (1.10), the line tangent at the 
point x’’ to the w-curve of S(x’’) lies in the plane tangent at the point x to 
S(x), and lies also in the plane tangent at the point x’ to S(x’), it follows that 
the families of asymptotic curves of both S(x) and S(x’) are harmonic to the 
parabolic congruence formed by the totality of these tangent lines. From 
these results, we state 

THEOREM 2. If two families of asymptotic curves are autoconjugate to the 
same parabolic congruence, they are both harmonic to a second parabolic con- 
gruence. 

2. Parabolic congruences autoconjugate to a family of asymptotic curves. 
Let S(x) be a surface having a family of asymptotic u-curves. Denote by S(y) 
the focal surface of a parabolic congruence that is autoconjugate to the family 
of asymptotic curves on S(x). 

The coordinates x satisfy a differential equation

(2.1) $x_{uu} = ax_u + bx_v + cx,$

and the coordinates y satisfy an equation

(2.2) $y_{uu} = Ay_u + By_v + Cy.$

Since the point x is on the line tangent at point y to the u-curve of S(y), the
coordinates x and y satisfy a relation

(2.3) $y_u + \lambda y = \mu x.$


To obtain the consequences of equations (2.1), (2.2), and (2.3), we take 
the derivative with respect to u of (2.3), and reduce the result by (2.2) and 
(2.3). The relation reduces to the equation 


By, + (C + — \? — Ad)y + (itu — pA — parA)x, 


which can be written in an abbreviated form with (2.3) to give the system 


1938] TRANSFORMATIONS OF A SURFACE 


(2.4)    $y_u + \alpha y = \beta x, \qquad y_v + \rho y = \sigma x_u + \tau x.$ 


Equations (2.4) imply relations of the form (2.2) and (2.3). To demon- 
strate this fact, we compute the derivative with respect to u of the first equa- 
tion of (2.4) and eliminate zx. 

Henceforth we shall investigate the consequences of equations (2.4) in 
view of (2.1). On differentiating (2.4) we obtain the system 


(2.5) 
$y_u + \alpha y = \beta x,$ 
$y_{uv} + \alpha y_v + \alpha_v y = \beta x_v + \beta_v x,$ 
$y_v + \rho y = \sigma x_u + \tau x,$ 
$y_{uv} + \rho y_u + \rho_u y = (\sigma_u + a\sigma + \tau) x_u + b\sigma x_v + (\tau_u + c\sigma) x.$ 


The right members of these equations are linearly dependent. Unless the de- 
terminant of the coefficients of the left members vanishes there exists a linear 
relation in the functions $y_{uv}$, $y_u$, $y_v$, and y. Such a relation together with (2.2) 
would restrict the surface S(y) to be developable. We consider the case for 
which S(y) is not developable, and for which the determinant vanishes. The 
expanded form of the determinant set equal to zero is 


(2.6)    $\rho_u - \alpha_v = 0.$ 


This equation makes it possible, by means of a transformation $y = \phi\bar{y}$, to re- 
duce equations (2.4) to the form 

(2.7)    $\bar{y}_u = g x, \qquad \bar{y}_v = R x_u + S x.$ 


On applying integrability conditions to the left members of (2.7), we obtain 
the relation 


$x_u(Ra + R_u + S) + x_v(Rb - g) + x(S_u + Rc - g_v) = 0.$ 


The left member of this equation must vanish identically in x and its deriva- 
tives. The coefficients set equal to zero give 


(2.8)    $S = -(R_u + Ra), \qquad g = Rb, \qquad S_u = g_v - Rc.$ 

By using the first and second equations of (2.8) to eliminate g and S from the 
third, we show that R is a solution of the equation 

(2.9)    $R_{uu} = -aR_u - bR_v + (c - a_u - b_v)R.$ 


These results are summarized in the following theorem. 




310 G. D. GORE [March 


THEOREM 3. Let there be given an integral surface S(x) of the point differ- 
ential equation (2.1), and let R be a solution of (2.9), with g and S determined 
by the first two of equations (2.8). Then the coordinates y obtained by quadratures 
from (2.7) determine the generating point of the focal surface of a parabolic con- 
gruence that is autoconjugate to the family of asymptotic u-curves on S(x). 


3. Families of asymptotic curves in relation F’. Two families of asymp- 
totic curves which are autoconjugate to the same parabolic congruence are, 
by Definition 3, in relation F’. 

Let S(x) and S(x’) be two surfaces bearing each a family of asymptotic 
curves, and so related that the two families are in relation F’. Let S(y) be 
the focal surface of the parabolic congruence to which they are autoconju- 
gate, the point differential equation of this surface being 


(3.1) Yuu = + + 
By reason of Theorem 1, the coordinates x and x’ can be chosen so as to 
satisfy the following relations 

Ryu Ruy, 

ty — ax = B(Ry, — Roy), 

(3.2) 

x = — Rey, 

xu — ax’ = B(R’y, — 

in which R and R’ are solutions of (3.1) for which RR, —R’R,+0. 


As a result of eliminating y and its derivatives from (3.2), we obtain the 
following relations: 


Ax = + Nx’, 


(3.3) 
x, — Bx = + + 


The coordinates x satisfy an equation of the form 


(3.4)    $x_{uu} = a x_u + b x_v + c x,$ 


and the coordinates x’ satisfy a similar one. By computing the derivative 
with respect to v of the first of (3.3), and the derivative with respect to u of 
the second, and reducing the resulting system by (3.4), it can be shown that 
unless the determinant $A_v - B_u$ is equal to zero, the point x lies in the plane 
tangent to S(x’) at the point x’. 

With the condition $A_v - B_u = 0$, equations (3.3) are transformed by the 
substitution $x = \theta\bar{x}$ into the form 




= + Nx’, 


(3.5) 
= Pxi + Mx) +Qx’. 


On taking suitable linear combinations of (3.5), we obtain the relations 
— A's’ = 
xi — = Pz, + Mik». 
An argument similar to the above suffices to show that these equations can 
be transformed, by a substitution $x' = \theta'\bar{x}'$, so that $A' = B' = 0$. We therefore 
write 


(3.6)    $\bar{x}'_u = m\bar{x}_u, \qquad \bar{x}'_v = p\bar{x}_u + m\bar{x}_v.$ 


Integrability conditions applied to the left members of (3.6) lead to a 
differential equation in $\bar{x}$ of the form 

(3.7)    $\bar{x}_{uu} = \bar{a}\bar{x}_u + \bar{b}\bar{x}_v,$ 

where 

(3.8)    $m_u = -\bar{b}p, \qquad m_v = p_u + \bar{a}p.$ 


We observe that in order to reduce (3.4) to the form (3.7) by a transformation 
$x = \theta\bar{x}$, it is necessary and sufficient that $\theta$ be a solution of (3.4). Coefficients 
of (3.7) are $\bar{a} = a - 2\theta_u/\theta$, $\bar{b} = b$. 

By eliminating m from (3.8) it is found that p satisfies the equation 


(3.9)    $p_{uu} + \bar{a}p_u + \bar{b}p_v + (\bar{a}_u + \bar{b}_v)p = 0.$ 

We have in conclusion 


THEOREM 4. Let S(x) be an integral surface of (3.4), and let $\theta$, p, and m be 
solutions of (3.4), (3.9), and (3.8) respectively. Then the coordinates x’ obtained 
by quadratures from the equations 


(3.10)    $\frac{\partial x'}{\partial u} = m\frac{\partial}{\partial u}\left(\frac{x}{\theta}\right), \qquad \frac{\partial x'}{\partial v} = p\frac{\partial}{\partial u}\left(\frac{x}{\theta}\right) + m\frac{\partial}{\partial v}\left(\frac{x}{\theta}\right)$ 


determine the generating point of a surface S(x’) having a family of asymptotic 
u-curves in relation F’ with the family of asymptotic u-curves on S(x). 


One readily verifies that the point having coordinates $y = x' - m(x/\theta)$ is 
the generating point of the focal surface of the parabolic congruence auto- 
conjugate to the asymptotic curves on S(x) and S(x’). 

4. Transformations F’ with the same parabolic congruence. Let S(x) and 
S(x’) be two surfaces with families of asymptotic u-curves in relation F’, as 
represented by the equations 

(4.1)    $x'_u = m x_u, \qquad x'_v = p x_u + m x_v.$ 

The coordinates x satisfy a differential equation 

(4.2)    $x_{uu} = a x_u + b x_v.$ 


In order to determine on a third surface a family of asymptotic curves 
that is autoconjugate to the parabolic congruence of lines which connect cor- 
responding points of S(x) and S(x’), we make the change of coordinates 
(4.3)    $x' = \theta' x'_1$ 

and define the coordinates of a point on the line xx’ as 

$x_1 = x - \theta x'_1.$ 

In terms of the old coordinates 

(4.4)    $x_1 = x - \frac{\theta}{\theta'} x'.$ 

The functions $\theta$ and $\theta'$ will be determined so that $x_1$ and $x'_1$ satisfy a system 
of equations similar to (4.1). 
By differentiating (4.3) and (4.4), and applying (4.1), we establish the 
equations 


= — — m0) + (6/ — m6.) |, 
(4.5) 


1 
= mak (0° — 0)m— xin + af ¢ — mo, — 
m m m 


If these equations are to take the form of (4.1), the coefficients of $x'_1$ must 
vanish. This gives the conditions 

(4.6)    $\theta'_u = m\theta_u, \qquad \theta'_v = p\theta_u + m\theta_v.$ 


From the integrability conditions on (4.6), $\theta$ is a solution of (4.2), and $\theta'$ 
is obtained by quadratures on (4.6). The function $\theta'$ is a solution of the point 
differential equation of S(x’). It can also be verified that 


(8) »2(2) 


The above deductions justify the following theorem: 


(4.7) 


THEOREM 5. Let there be given two surfaces S(x) and S(x’) sustaining fami- 
lies of asymptotic curves in relation F’, and let the coordinates x and x’ be chosen 
so that equations (4.1) and (4.2) hold. Then in terms of $\theta$, a solution of (4.2), 
and $\theta'$, a corresponding solution of (4.6), the coordinates $x_1$ defined by (4.4) de- 
termine the generating point of a surface $S(x_1)$ which has a family of asymptotic 
u-curves in relation F’ with the corresponding families on S(x) and S(x’). 


As an example of the above transformation, we shall establish the follow- 
ing theorem: 


THEOREM 6. If a surface S(x) bearing a family of asymptotic curves lies 
on a hyperquadric, any parabolic congruence autoconjugate to the family meets 
the hyperquadric again in a surface bearing a family of asymptotic curves in 
relation F’ with the first family. 


Let the equation of the hyperquadric be written as 
(4.8) = 0. 


Let the chosen parabolic congruence be autoconjugate to the family of 
asymptotic curves on S(x’), where the coordinates x’ of the generating point 
of S(x’) are defined by (4.1). From differentiating (4.8) and reducing the 
results by (4.2), we obtain the relations 


By means of these relations, it can be shown that the function 
(4.9) = >> + 
is a solution of the point equation (4.2) of S(x), and that the function 


is a corresponding solution of (4.6). 

If $S(x_1)$ is determined by the transformation (4.4), using $\theta$ and $\theta'$ from 
(4.9) and (4.10), it is easy to show that $x_1$ satisfies the equation (4.8) of the 
hyperquadric. The surfaces $S(x_1)$ and S(x) are in relation F’ by transforma- 
tion (4.4). 


As a second example of the transformation, we consider 


THEOREM 7. If a transversal surface of a parabolic congruence lies in a 
hyperplane, it has on it a family of asymptotic curves that is autoconjugate to the 
parabolic congruence. 


Let the given surface be transversal to the parabolic congruence of lines 
that join corresponding points of S(x) and S(x’), and define the coordinates 
of its generating point as 


$\xi = x - \lambda x'.$ 


The equation of the hyperplane may be taken as $\xi^i = 0$. If $\lambda$ is determined so 
that $\xi^i = 0$, its value is $\lambda = x^i/x'^i$. As a consequence 


(4.11)    $\xi = x - \frac{x^i}{x'^i} x'.$ 


The functions $x^i$ and $x'^i$ have the properties of $\theta$ and $\theta'$ which are required 
by the F’ transformation (4.4). Hence $S(\xi)$ is an F’ transform of each of S(x) 
and S(x’). 

For the F’ transformation (4.4), which sends S(x) into $S(x_1)$ by means of 
the auxiliary surface $S(x') = S(x'_1)$, we wish to determine an inverse. That is, 
we wish to determine a pair of functions $\theta^{-1}$ and $(\theta')^{-1}$ such that 


(4.12) 
where 6 is a solution of the point differential equation of S(x), and (@’)-} 
is related to @—! by equations similar to (4.6) which give the relations between 
6’ and @. Analogues of (4.6) are obtained by solving (4.5), in view of (4.6), 


for and then replacing x/ by (0’)-' and x by 6-'. The equations are 


(4.13) 
po’ m 
= + ——__ 971. 
(6’ — 6’ — md 


Equations (4.13), as well as the point differential equations of S(x,) and 
S(x/) obtainable from them by integrability conditions, are satisfied by the 
functions 


(4.14) 


m 
1 




These functions also give a result consistent with (4.4) when they are substi- 
tuted into (4.12). It can be verified that 


(4.15) 
dv dv\ 0 P 


THEOREM 8. If S(x) is transformed into $S(x_1)$ by means of $\theta$, $\theta'$, and the 
auxiliary surface S(x’), as expressed in (4.4), then $S(x_1)$ is transformed into 
S(x) by means of the functions $\theta^{-1}$, $(\theta')^{-1}$, and $x'_1$, where $\theta^{-1}$, $(\theta')^{-1}$, and $x'_1$ are 
defined by (4.14) and (4.3), and the transformation is expressed by (4.12). 


If by means of a second solution $\theta_2$ of the point equation (4.2), S(x) is 
transformed into a second surface $S(x_2)$ by the relation 


(4.16)    $x_2 = x - \frac{\theta_2}{\theta_2'} x',$ 


the surfaces $S(x_1)$ and $S(x_2)$ are in relation F’. It is useful to know by what 
analytical relation $S(x_1)$ is transformed into $S(x_2)$. By subtracting equation 
(4.4) from (4.16) we obtain 


(4.17)    $x_2 = x_1 - \frac{\theta_3}{\theta_3'}\,\frac{x'}{\theta'},$ 

where the function 

(4.18)    $\theta_3 = \theta_2 - \frac{\theta}{\theta'}\theta_2'$ 

is by (4.4) a solution of the point equation of $S(x_1)$, and the function 

(4.19)    $\theta_3' = \frac{\theta_2'}{\theta'}$ 

is a solution of the point equation of $S(x'/\theta')$. 

5. Transformations F’ by two congruences. Let S(x’) and S(x’’) be two 
F’ transforms of S(x), where the differential equations of the transformations 
are 

(5.1)    $x'_u = m_1 x_u, \quad x'_v = p_1 x_u + m_1 x_v; \qquad x''_u = m_2 x_u, \quad x''_v = p_2 x_u + m_2 x_v,$ 

in which the coordinates x satisfy (4.2). By means of S(x’), S(x’’), and a 
solution $\theta_1$ of (4.2) we obtain two more F’ transforms of S(x), which are repre- 
sented by the equations 


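(5.2)    $x_{1,1} = x - \frac{\theta_1}{\theta_1'} x', \qquad x_{1,2} = x - \frac{\theta_1}{\theta_1''} x'',$ 


in which $\theta_1'$ and $\theta_1''$ are obtained from (5.1) by replacing x by $\theta_1$ throughout. 
There is an F’ transform of S(x’’) by means of $\theta_1''$ and S(x’). The co- 
ordinates of its generating point are determined by the equation 


(5.3)    $x_1'' = x'' - \frac{\theta_1''}{\theta_1'} x'.$ 


It is easy to show that the point $x_1''$ is the intersection of the lines x’x’’ and 
$x_{1,1}x_{1,2}$. 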
By differentiating (5.3) and the first of (5.2), we can establish, at the end 


of some labor, the following equations: 


— (x11) = 
ou 


— 


— (95) = | os + 1, 


— moi’ 
+ — (%1,1). 
= m0, Ov 


= m0; 


These equations show that S(x,,;) and S(x;,:) are in relation F’. It follows 
that S(x;/’) and are in relation F’. 

An inverse of the F’ transformation (4.4) is given by (4.12) in view of 
(4.14). On adapting these equations to the first of (5.2) we determine —6,/0/ 
as a solution of the point differential equation of S(x1,1). If we set x1,1:= —0;/0/ 
in (5.4), there is obtainable by quadrature a corresponding solution of the 
point equation of S(x//{). We verify that x{‘/= —0{’/0/ is such a solution. 
Using these two solutions, we construct the transformation 

1 
It can be shown that x{. =%:,2 by means of equations (5.2) and (5.3). These 
results are summarized in 

THEOREM 9. If a family of asymptotic curves on S(x) is transformed into 
two other families of asymptotic curves on $S(x_{1,1})$ and $S(x_{1,2})$ respectively by 
means of the same function $\theta_1$, the latter two families are in relation F’; moreover, 
any two of the three families S(x), $S(x_{1,1})$, $S(x_{1,2})$ are transforms of the third by 
means of the same solution of the point differential equation of the third. 


The three families form a close analogy to a triad of conjugate nets.* 
6. A theorem of permutability of transformation F’. From (5.2) it can be 
seen that a solution of the point differential equation of $S(x_{1,1})$ is given by 


(6.1)    $\theta_{12} = \theta_2 - \frac{\theta_1}{\theta_1'}\theta_2',$ 

where $\theta_2$ is a solution of (4.2), and $\theta_2'$ is a corresponding solution of the first 
pair of (5.1). To get a solution of the point equation of $S(x_1'')$, let $\theta_2''$ be a 
solution of the second pair of (5.1) corresponding to $x = \theta_2$; then from (5.3) 
we have the desired solution 

(6.2)    $\theta_{12}'' = \theta_2'' - \frac{\theta_1''}{\theta_1'}\theta_2'.$ 

By means of the above solutions $\theta_{12}$ and $\theta_{12}''$, we construct the following 
transformation of $S(x_{1,1})$: 


(6.3)    $x_{(12)} = x_{1,1} - \frac{\theta_{12}}{\theta_{12}''} x_1''.$ 
Using the solution $\theta_2$, and equations similar to (5.2) we have two more 
transformations of S(x) as follows: 


(6.4)    $x_{2,1} = x - \frac{\theta_2}{\theta_2'} x', \qquad x_{2,2} = x - \frac{\theta_2}{\theta_2''} x''.$ 


Corresponding points of S(x), $S(x_{1,2})$, and $S(x_{2,2})$ are on a straight line. 
They are in relation F’ by pairs. To determine the analytic relation by which 
$S(x_{1,2})$ is transformed into $S(x_{2,2})$ we compare equations (5.2) and (6.4) to 
(4.4) and (4.16), and draw conclusions corresponding to (4.17), (4.18), and 
(4.19). The results show that 


(6.5)    $\theta_2 - \frac{\theta_1}{\theta_1''}\theta_2''$ 


is the required solution of the point equation of $S(x_{1,2})$, and that $\theta_2''/\theta_1''$ is 
a solution of the point equation of $S(x''/\theta_1'')$. That is, $S(x_{2,2})$ is a transform 
of $S(x_{1,2})$ by means of the solution (6.5). 

From equations (5.5), placing x/, =%1,2, and (6.3), it is seen that xq: is a 
transform of S(x:,2) by means of the function 


* Eisenhart, loc. cit., p. 44. 




(6.6) 


which is a solution of the point equation of $S(x_{1,2})$. This expression is reduci- 
ble to (6.5) by (6.1) and (6.2). Hence we have shown that $S(x_{2,2})$ and $S(x_{(12)})$ 
are transforms of $S(x_{1,2})$ by means of the same solution of its point differ- 
ential equation. It follows that $S(x_{(12)})$ and $S(x_{2,2})$ are in relation F’. We shall 
now investigate $S(x_{(12)})$ through its relation to $S(x_{2,2})$ and $S(x_{2,1})$. 

The point where the line x’x’’ intersects the line +2,2%2,, has coordinates 


” 


(6.7) x", 
which are obtained by subtracting equations (6.4). From equations (6.4) we 
obtain as a solution of the point differential equation of S(x2,2), 


A corresponding solution 631’’ is obtained from (6.7) as 


” 
62 


From these two solutions, we construct the transformation 


6:02’ — O61’ ,,,, 


= — 8/02’ — — %2,2 - 


Using the foregoing equations, it is easy to show that 
(0102 — 02'61) x’ + (0/0: — 
— Og 


In order to estimate the prevalence of the transformations F’ that exist 
for given families of asymptotic curves, we count the constants in the fore- 
going quadratures. From the manner in which $\theta_1'$, $\theta_1''$, $\theta_2'$, $\theta_2''$ were obtained, 
each contains an arbitrary constant. If $S(x_{1,1})$ and $S(x_{2,2})$ are chosen trans- 
forms of S(x), the constants in $\theta_1'$ and $\theta_2''$ are determined by the choice. The 
constants in $\theta_1''$ and $\theta_2'$ are left arbitrary. From these facts, we state for 
transformations F’ a theorem of permutability. 


THEOREM 10. If $S(x_{1,1})$ and $S(x_{2,2})$ are two F’ transforms of S(x) by means 
of functions $\theta_1$ and $\theta_2$, and two distinct parabolic congruences, there exist $\infty^2$ 
surfaces $S(x_{(12)})$, each bearing a family of asymptotic curves in relation F’ with 
$S(x_{1,1})$ and $S(x_{2,2})$. 


By employing the notation used in §§4, 5, and 6, which is similar to that 
used by Eisenhart,* the equations in these sections can be given metric inter- 
pretations which yield a theory of parallel transformations for families of 
asymptotic curves, and also radial transformations of the same. 

7. General transversal surfaces of a parabolic congruence. Consider S(x), 
the most general transversal surface of a parabolic congruence. Let the curves 
cut out on S(x) by the developable surfaces of the congruence be used as 
parametric u-curves. It has been demonstrated by the author† that under 
these conditions the coordinates x satisfy an integrable system of differential 
equations of the form 


= + + + dxy + ex, + fx, 


7.1 
= + 2b’ + toy + + + f'x. 


These equations require the following conditions of integrability: 


c=2b-—c'=0, 
ac’ — ci +e 2b’c’ = 0, 
2b, + 2ab’ — 2a’'b + d — 2b/ — 4b"% —e’ = 0, 
a,—a, — d’ — = 0, 
+ ae’ — — 2b’'e’ — +f =0, 
d, + ad’ — a’d — 2b'd’ — — f' =0, 
fo taf’ —a’f — 20’f' — fi =0. 
In the sense of Definition 1 the u-curves of the surface S(x) are auto- 
conjugate to the parabolic congruence to which S(x) is transversal. 
To determine the focal surface S(y) of a parabolic congruence which is 


autoconjugate to the u-curves of S(x), an integral surface of (7.1), consider 
the point having coordinates 
(7.3) Y = — — + — a'c’ — e’)x. 
Using equations (7.1) and (7.2) it can be verified that x and y are related by 
equations of the form 
u — (a — 26’)y = Gx, 

(7.4) Yu — ( dy 

— = hn + ky. 
On eliminating x from (7.4) it is found that the coordinates y satisfy an equa- 
tion 


* Eisenhart, loc. cit., ch. 2. 
† Gore, University of Chicago Dissertation, 1932, p. 47. 




(7.5)    $y_{uu} = A y_u + B y_v + C y.$ 


The first of equations (7.4) shows that S(x) is transversal to the u-tangent 
lines of S(y), while equation (7.5) indicates that the u-curves of S(y) are a 
family of asymptotic curves. 

Since the surface S(y) can be transformed into a sequence of surfaces by 
transformations of the type of (1.2) and since the surface S(x) is transversal 
to the connecting tangent lines between two consecutive surfaces, we can 
make use of a general inscribing theorem* which we quote: 


“Let T denote a sequence of surfaces in which the points of each surface 2 i+: 
are joined in a one-to-one manner to the corresponding points of 2; by a set Q; 
of ©* osculating spaces of v dimensions belonging to the curves on the surface Z;. 
Let =; be any surface that is transversal to the set of osculants Q,. Then it follows 
that the transversal surface 2; belongs to a sequence of surfaces T' which is in- 
scribed in the given sequence T.” 


The above theorem shows that the surface S(x) belongs to a sequence 
that is inscribed in the sequence to which S(y) belongs. But since S(y) can be 
transformed into a multiplicity of sequences, the same is true of S(x). The 
transformations that operate to produce them are similar to (1.2). 


* Gore, Inscribed sequences of surfaces associated with generalized sequences of Laplace, these 
Transactions, vol. 36 (1934), p. 532. 


CENTRAL Y.M.C.A. COLLEGE, 
CHICAGO, ILL. 


ON FUNCTIONS WITH BOUNDED DERIVATIVES* 


BY 
OYSTEIN ORE 


1. The following well known theorem is due to A. Markoff:† 


Let $f_n(x)$ be a polynomial of degree n and let $M_0$ be the maximum of $|f_n(x)|$ 
in the interval (a, b). One then has for the same interval 


(1)    $|f_n'(x)| \le \frac{2M_0 n^2}{b - a}.$ 


The equality sign can only hold for the polynomials 

(2)    $f_n(x) = M_0 T_n\left(\frac{2x - a - b}{b - a}\right),$ 

where $T_n(x)$ is the nth Tschebyschef polynomial. 
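
The extremal role of the Tschebyschef polynomial in Markoff's theorem can be checked numerically. The following Python sketch is an illustration added for the reader and is not part of the original argument. It assumes the classical normalization in which the extremal polynomial on (a, b) is the nth Tschebyschef polynomial evaluated at (2x - a - b)/(b - a); the function names are hypothetical. It samples the polynomial and its derivative on a grid and compares the largest sampled derivative with the Markoff bound.

```python
def chebyshev_T_and_derivative(n, x):
    """Return (T_n(x), T_n'(x)) using T_{k+1} = 2x T_k - T_{k-1} and its derivative."""
    t_prev, t_cur = 1.0, x      # T_0, T_1
    d_prev, d_cur = 0.0, 1.0    # T_0', T_1'
    if n == 0:
        return t_prev, d_prev
    for _ in range(n - 1):
        t_next = 2.0 * x * t_cur - t_prev
        d_next = 2.0 * t_cur + 2.0 * x * d_cur - d_prev
        t_prev, t_cur = t_cur, t_next
        d_prev, d_cur = d_cur, d_next
    return t_cur, d_cur


def markov_check(n, a, b, samples=2001):
    """Sample f(x) = T_n((2x - a - b)/(b - a)) on [a, b].

    Returns (max |f|, max |f'|, Markoff bound 2 * max|f| * n^2 / (b - a)),
    where the maxima are taken over the sample grid (endpoints included).
    """
    scale = 2.0 / (b - a)       # chain-rule factor d/dx of the affine substitution
    max_f = 0.0
    max_df = 0.0
    for j in range(samples):
        x = a + (b - a) * j / (samples - 1)
        t = (2.0 * x - a - b) / (b - a)
        value, derivative = chebyshev_T_and_derivative(n, t)
        max_f = max(max_f, abs(value))
        max_df = max(max_df, abs(derivative) * scale)
    bound = 2.0 * max_f * n * n / (b - a)
    return max_f, max_df, bound


max_f, max_df, bound = markov_check(5, 0.0, 3.0)
print(max_f, max_df, bound)
```

For this choice the sampled derivative attains the bound at the endpoints of the interval, which is consistent with the statement that equality holds for the Tschebyschef polynomials.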

We shall show that this theorem may be formulated in such a manner 
that it holds for arbitrary functions with a certain number of derivatives. 
A polynomial of degree n is characterized by the property that its (n+1)st 
derivative vanishes identically. The theorem of Markoff may be considered 
as a theorem on functions having a bounded (n+1)st derivative in a certain 
interval. One also obtains bounds for all derivatives from the first to the nth. 
Similar results may be obtained for analytic functions bounded together with 
some derivative in a part of the complex plane. The proofs are simple and 
depend upon the polynomial character of the Taylor expansion. It should be 
remarked that the same extension principle may be applied to several other 
theorems on polynomials. 

2. We shall first prove: 


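THEOREM 1. Let f(x) be a function for which derivatives up to the (n+1)st 
exist. Let 


(3)    $|f(x)| \le M_0, \qquad |f^{(n+1)}(x)| \le M_{n+1}$ 

in the interval (a, b). Then one has in the same interval 

(4)    $|f'(x)| \le \frac{2n^2}{b - a}\left(M_0 + \frac{(b - a)^{n+1}}{(n + 1)!} M_{n+1}\right).$ 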

* Presented to the Society, March 26, 1937; received by the editors March 20, 1937. 
† A. Markoff, Sur une question posée par Mendeleieff, Bulletin of the Academy of Sciences of St. 
Petersburg, vol. 62 (1889), pp. 1-24. 




322 OYSTEIN ORE [March 


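To prove Theorem 1 we apply the Taylor expansion in the following form 


(5)    $f(x + h) = f(x) + \frac{h}{1!} f'(x) + \cdots + \frac{h^n}{n!} f^{(n)}(x) + R_n,$ 

where* 

(6)    $R_n = \frac{1}{n!} \int_x^{x+h} (x + h - t)^n f^{(n+1)}(t)\,dt.$ 

We now suppose x fixed in the interval (a, b) and let h vary such that 
x+h belongs to the same interval. Hence 

(7)    $a - x \le h \le b - x.$ 

For the remainder term (6) we then easily find 

(8)    $|R_n| \le M_{n+1} \frac{|h|^{n+1}}{(n + 1)!} \le M_{n+1} \frac{(b - a)^{n+1}}{(n + 1)!}.$ 

We next consider the polynomial 


(9)    $P(h) = f(x) + \frac{h}{1!} f'(x) + \cdots + \frac{h^n}{n!} f^{(n)}(x).$ 


From (5) follows 

$P(h) = f(x + h) - R_n$ 

and from (3) and (8) in the interval (7) 


(10)    $|P(h)| \le M_0 + \frac{(b - a)^{n+1}}{(n + 1)!} M_{n+1} = K.$ 


By applying Markoff’s theorem to the polynomial P(h) we find 


$|P'(h)| \le \frac{2n^2}{b - a} K$ 


and, since h=0 belongs to the interval (7), we have 


$|P'(0)| = |f'(x)| \le \frac{2n^2}{b - a} K.$ 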

3. When f(x) is a polynomial, Theorem 1 obviously reduces to the theo- 
rem of A. Markoff. One may state Theorem 1 briefly by saying that when 
f(x) and $f^{(n+1)}(x)$ are bounded in (a, b) then f’(x) has the same property. By 
repetition one obtains a bound for all intermediate derivatives 

* Professor Hille pointed out the advantage in using this form for the remainder term. 


$f^{(i)}(x)$. 
A better bound is obtained however by applying the preceding extension prin- 
ciple directly to a more general theorem by W. Markoff:* 
Let $f_n(x)$ be a polynomial of degree n and $M_0$ the maximum of $|f_n(x)|$ in the 
interval (a, b). Then one has in the same interval 


(i) 


(x)| S K(i, n)- 


(11) 
where 
— 1) -- + (n®? — (i — 1)?) 


nN 
12 K 4, n) = = - 22%. i,2i- 


The equality sign can only hold for the Tschebyschef polynomials (2). 
When this result is applied to the derivatives of the polynomial (9) in the 
interval (7) we obtain: 
THEOREM 2. Let f(x) possess an (n+1)st derivative $f^{(n+1)}(x)$ such that in 
the interval (a, b) 
$|f(x)| \le M_0, \qquad |f^{(n+1)}(x)| \le M_{n+1}.$ 
Then all intermediate derivatives of f(x) are bounded in the same interval by 
(b a) 
(n+ 1)! 
Let us observe that the results of W. Markoff were proved only under 
the assumption that the polynomial f(x) has real coefficients. One may how- 
ever easily extend the results to complex coefficients and hence Theorem 2 


to complex valued functions. 
If one introduces the notation 


n 1 
13 (x) | < ———: Mo + Maar: 
( ) lf o+ +1 


then 


M; 
(15) = (6 — a)é 
1: 
and the inequality (13) may be written in the simpler form 


(16) M:< (Mo + Masi). 
n+t 


* W. Markoff, Über Polynome die in einem gegebenen Intervalle möglichst wenig von Null abweichen, 
Mathematische Annalen, vol. 77 (1916), pp. 213-258. 




This relation shows that the constants M; or M; for a real, differentiable func- 
tion satisfy certain restricting conditions. It suggests the very interesting 
problem of determining the necessary and sufficient condition in order that 
a series of positive numbers 


Mo, 


be the maxima of the derivatives of a function f(x) in an interval. 

4. Let us next turn to the case of analytic functions. Let us suppose that 
f(z) is analytic and regular in a certain domain D in the complex plane. Fur- 
thermore, $f^{(n+1)}(z)$ is bounded in the same domain. Let D be bounded by a 
Jordan curve C such that one may draw through each point in D a chord of 
length d>0 entirely contained in D. When Theorem 1 is applied to the chords 
of D, one finds that in D 


(ate + Mess) 
This remark gives extensions of results obtained by Jackson,* Sewell,† and 
others. For the higher derivatives one finds corresponding to (13) 


| #(2)| Ki n)-K, 


where K(i, n) is given by (12). 

One may however obtain considerably better results through another pro- 
cedure. For polynomials in the complex plane we have the following theorem 
of S. Bernstein:‡ 

Let $f_n(z)$ be a polynomial of degree n and let $M_0$ denote the maximum of 
$|f_n(z)|$ on a circle with radius R. Then we have on the same circle 


(17)    $|f_n'(z)| \le \frac{n M_0}{R}.$ 


The equality sign holds only for 
(18) fols) = *) 
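
The theorem of Bernstein can be illustrated numerically. The Python sketch below is an added illustration under stated assumptions, not part of the paper: it takes the inequality in its classical form, namely that the derivative of a polynomial of degree n is bounded on the circle |z| = R by n M_0 / R, and it uses hypothetical helper names. Maxima are estimated on a finite set of equally spaced points of the circle, so the comparison is only approximate.

```python
import cmath
import math


def poly_eval(coeffs, z):
    """Evaluate sum(coeffs[k] * z**k) by Horner's rule."""
    acc = 0j
    for c in reversed(coeffs):
        acc = acc * z + c
    return acc


def poly_derivative(coeffs):
    """Coefficients of the derivative, lowest degree first."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def bernstein_check(coeffs, radius, samples=720):
    """Return (sampled max |p'|, n * sampled max |p| / R) on the circle |z| = R."""
    n = len(coeffs) - 1
    deriv = poly_derivative(coeffs)
    max_p = 0.0
    max_dp = 0.0
    for j in range(samples):
        z = radius * cmath.exp(2j * math.pi * j / samples)
        max_p = max(max_p, abs(poly_eval(coeffs, z)))
        max_dp = max(max_dp, abs(poly_eval(deriv, z)))
    return max_dp, n * max_p / radius


# A generic cubic: the inequality is strict.
print(bernstein_check([1, -2, 0.5, 3], 2.0))
# The monomial z^3: both sides agree, the case of equality.
print(bernstein_check([0, 0, 0, 1], 2.0))
```

The second call shows the two quantities coinciding for a pure power of z, in agreement with the remark that equality holds only for special polynomials.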


* D. Jackson, On the application of Markoff’s theorem to problems of approximation in the complex 
domain, Bulletin of the American Mathematical Society, vol. 37 (1931), pp. 883-890. 

† W. E. Sewell, Generalized derivatives and approximation by polynomials, these Transactions, 
vol. 41 (1937), pp. 84-123. This paper gives further references particularly to Szegö and Montel. 

‡ S. Bernstein, Leçons sur les propriétés extrémales et la meilleure approximation des fonctions 
analytiques d’une variable réelle, Paris, 1926. See also M. Riesz, Eine trigonometrische Interpolations- 
formel und einige Ungleichungen für Polynome, Jahresbericht der Deutschen Mathematiker-Vereini- 
gung, vol. 23 (1914), pp. 354-368. 




By means of the same extension principle which we used in the preceding 
this theorem may be extended to arbitrary analytic functions. Let f(z) be a 
function analytic on the circle C. It is not necessary to assume that f(z) is 
regular in the interior of C. The results hold even when f(z) is a branch of an 
analytic function not returning to its original value by a circuit of C. 

In all cases there exists a Taylor expansion 


(19)    $f(z + h) = f(z) + \frac{h}{1!} f'(z) + \cdots + \frac{h^n}{n!} f^{(n)}(z) + R_n,$ 


where 

(20)    $R_n = \frac{(-1)^n}{n!} \int_z^{z+h} (\zeta - z - h)^n f^{(n+1)}(\zeta)\,d\zeta.$ 


Here z and z+h are points on C and the path of integration is taken along C 
in some fixed direction. Furthermore f(z+h) is the value of f(z) determined 
by the path. Let us now suppose that for the chosen branch of f(z) we have 


$|f(z)| \le M_0, \qquad |f^{(n+1)}(z)| \le M_{n+1}$ 


for the points of C. For the remainder term $R_n$ in (20) one finds the estimate 

$|R_n| \le \frac{2}{n!} (2R)^{n+1} M_{n+1} \int_0^{\pi/2} \cos^n\varphi\,d\varphi.$ 


The last integral tends to zero with increasing n. Let us use however only the 
rough estimate 


2 


Now let z be a fixed point on C. Since z+h is also located on C the point h 
describes another circle C’ with the same radius. Let us write again 


(22)    $P(h) = f(z) + \frac{h}{1!} f'(z) + \cdots + \frac{h^n}{n!} f^{(n)}(z).$ 
According to (19) and (21) we have for the points on the circle C’ 
(23) | | Mo += = 0. 

By applying the theorem of Bernstein, we obtain 


nN 
| P()| 5-0 


and, since h=0 is located on C’, 




=|/@| = 0. 


By specialization to the unit circle this theorem may be stated as follows: 
THEOREM 3. Let f(z) be a function which is analytic on the unit circle $C_1$. 
If then 
$|f(z)| \le M_0, \qquad |f^{(n+1)}(z)| \le M_{n+1}$ 


on $C_1$, then 


n! 


+ Muss) 


on the unit circle. 


The generalization of this theorem to higher derivatives implies the fol- 
lowing extension of the theorem of Bernstein: 


THEOREM 4. Let $f_n(z)$ be a polynomial of degree n and let $M_0$ denote its 
maximum on a circle with radius R. Then one has on the same circle 


$|f_n^{(i)}(z)| \le n(n - 1) \cdots (n - i + 1) \frac{M_0}{R^i}.$ 


The equality sign can hold only for the polynomials (18). 

To prove Theorem 4 it is only necessary to apply the theorem of Bern- 
stein i times. 

When Theorem 4 is applied to the polynomial P(h) in (22) we obtain 


n! Q 
(n — i)! 


where Q is defined by (23). For h=0 we find the desired result 


| P(a)| < 


n— i)! Ri n! 
THEOREM 5. Let f(z) be analytic on the unit circle and 


on the circle. One then also has 


| | < 
(n — i)! 


YALE UNIVERSITY, 
NEW HAVEN, CONN. 




COMPARISON OF PRODUCTS OF METHODS 
OF SUMMABILITY* 


BY 
RALPH PALMER AGNEW 


1. Introduction. A sequence $s_n$ of complex numbers (or complex-valued 
functions) is called summable to L by the method of summability 


(A)    $S_n = \sum_{k=1}^{\infty} a_{nk} s_k$ 

determined by the matrix $A = (a_{nk})$ of real or complex constants, if the trans- 
form $S_n$ exists and $\lim_{n\to\infty} S_n = L$. The matrix A (and method of summability 
A) is called row-finite if for each n, $a_{nk} = 0$ for all sufficiently great k; and is 
called triangular if $a_{nk} = 0$ for k>n. The method A is regular if $s_n \to L$ implies 
$S_n \to L$. Necessary and sufficient conditions that A be regular are, by the 
Silverman-Toeplitz theorem, 


(1.1)    $\sum_{k=1}^{\infty} |a_{nk}| < M, \qquad M = \text{constant},$ 


(1.2)    for each k, $\lim_{n\to\infty} a_{nk} = 0,$ 


(1.3)    $\lim_{n\to\infty} \sum_{k=1}^{\infty} a_{nk} = 1.$ 


The set of sequences summable A is called the convergence field of A. 
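
The definitions above may be made concrete with a familiar regular method. The Python sketch below is an added illustration, not an example from the paper: it assumes the arithmetic-mean matrix with entries 1/n for k at most n and zero otherwise, which is triangular, and applies it to the divergent sequence 1, 0, 1, 0, .... All names in the code are hypothetical.

```python
def cesaro_row(n):
    """Nonzero part of row n (1-indexed) of the arithmetic-mean matrix: a_nk = 1/n for k <= n."""
    return [1.0 / n] * n


def transform(row, sequence):
    """S_n = sum over k of a_nk * s_k, using only the nonzero part of the row."""
    return sum(a * s for a, s in zip(row, sequence))


def s(k):
    """The divergent sequence 1, 0, 1, 0, ... (k starts at 1)."""
    return 1.0 if k % 2 == 1 else 0.0


N = 1000
sequence = [s(k) for k in range(1, N + 1)]
row = cesaro_row(N)

row_abs_sum = sum(abs(a) for a in row)   # stays bounded, as in the first regularity condition
first_column_entry = row[0]              # a_n1 = 1/n, which tends to 0 as n grows
row_sum = sum(row)                       # equals 1 for every n
S_N = transform(row, sequence)           # close to 1/2 for large N

print(row_abs_sum, first_column_entry, row_sum, S_N)
```

The sequence is not convergent, yet its transform approaches 1/2, so it belongs to the convergence field of this method.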
Let 


(B)    $T_n = \sum_{k=1}^{\infty} b_{nk} s_k$ 


denote a second method of summability. 

In case each sequence summable B is summable A to the same value, A is 
said to include B and we write A > B. In case A > B and B > A, A and B are 
called equivalent and we write A~B. In case the equality $L_A = L_B$ holds for 
each sequence summable A to $L_A$ and summable B to $L_B$, the methods A and 
B are called mutually consistent (or consistent). 

In terms of A and B it is possible to define two “products,” each of which 
is a new method of summability. The iteration product, ordinarily denoted by 


* Presented to the Society, February 20, 1937; received by the editors March 27, 1937. 


328 R. P. AGNEW [May 


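AB, is the method which associates with a given sequence the A transform 
of its B transform, that is, 


(AB)    $U_n = \sum_{p=1}^{\infty} a_{np} \sum_{k=1}^{\infty} b_{pk} s_k.$ 


Thus $s_n$ is summable AB to L if $\lim U_n = L$. The composition product is also 
at times denoted by AB; it is the method whose matrix is the product AB 
(which we denote by A·B) of the matrices A and B. Thus we write 


(A·B)    $V_n = \sum_{k=1}^{\infty} \Bigl(\sum_{p=1}^{\infty} a_{np} b_{pk}\Bigr) s_k$ 


and $s_n$ is summable A·B to L if $V_n \to L$. 
We observe that $U_n$ and $V_n$ are, if they exist, respectively the “sum by 
rows” and the “sum by columns” of the double series 

(1.4) 
$a_{n1}b_{11}s_1 + a_{n1}b_{12}s_2 + a_{n1}b_{13}s_3 + \cdots$ 
$+\, a_{n2}b_{21}s_1 + a_{n2}b_{22}s_2 + a_{n2}b_{23}s_3 + \cdots$ 
$+\, a_{n3}b_{31}s_1 + a_{n3}b_{32}s_2 + a_{n3}b_{33}s_3 + \cdots$ 
$+\, \cdots$ 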


If A and B are regular and $s_n$ is bounded, the series (1.4) converges abso- 
lutely and $U_n = V_n$; but without these restrictions it is not so obvious that 
$U_n = V_n$. There is in fact the possibility that AB and A·B may fail to be 
equivalent or even consistent. 
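
The distinction between the two products disappears when only finitely many terms of the double series are different from zero, for then the sum by rows and the sum by columns are rearrangements of one finite sum. The Python sketch below, added as an illustration and not taken from the paper, checks this for two triangular matrices truncated to finite size; the particular matrices (arithmetic means and a two-term average) and all function names are assumptions made for the example. Exact rational arithmetic is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction


def cesaro_matrix(size):
    """a_nk = 1/n for k <= n, else 0 (rows and columns numbered from 1)."""
    return [[Fraction(1, n) if k <= n else Fraction(0) for k in range(1, size + 1)]
            for n in range(1, size + 1)]


def two_term_average_matrix(size):
    """b_11 = 1; for n > 1, b_{n,n-1} = b_{n,n} = 1/2; all other entries 0."""
    rows = []
    for n in range(size):
        row = [Fraction(0)] * size
        if n == 0:
            row[0] = Fraction(1)
        else:
            row[n - 1] = Fraction(1, 2)
            row[n] = Fraction(1, 2)
        rows.append(row)
    return rows


def iteration_product(a, b, s):
    """U_n: the A transform of the B transform of s."""
    t = [sum(b[p][k] * s[k] for k in range(len(s))) for p in range(len(b))]
    return [sum(a[n][p] * t[p] for p in range(len(t))) for n in range(len(a))]


def composition_product(a, b, s):
    """V_n: the transform of s by the matrix product of a and b."""
    size = len(s)
    c = [[sum(a[n][p] * b[p][k] for p in range(size)) for k in range(size)]
         for n in range(size)]
    return [sum(c[n][k] * s[k] for k in range(size)) for n in range(size)]


size = 8
A = cesaro_matrix(size)
B = two_term_average_matrix(size)
s = [Fraction((-1) ** k) for k in range(size)]   # 1, -1, 1, -1, ...

U = iteration_product(A, B, s)
V = composition_product(A, B, s)
print(U == V)
```

Because both matrices are triangular, the first `size` values of the two transforms computed from the truncated matrices coincide with those of the untruncated methods. For matrices that are not row-finite the interchange of summations is no longer automatic, which is the situation the paper goes on to examine.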

It is the main object of this paper to compare pairs selected from the four 
transformations A, B, AB, and A·B, considering in each case questions of 
inclusion, equivalence, and consistency. It appears that unless either or both 
of the matrices $(a_{nk})$ and $(b_{nk})$ are assumed to belong to restricted types, 
the results obtained are largely negative. These negative results are estab- 
lished by examples. Several examples are explicitly given, each for two rea- 
sons. In the first place each example, consisting of two regular methods of 
summability satisfying prescribed conditions and a sequence, can be manu- 
factured only after considerable experimentation. In the second place the 
examples are largely of such obviously pathological character that they leave 
hope of obtaining positive theorems involving matrices of restricted types. 
Some such theorems are given in this paper, particularly in §11. It is doubt- 
less true that more (and better) theorems of this kind will appear in the fu- 
ture. 

In §12 we compare AB with A’B’ and A·B with A’·B’ where the pair 
A, A’ and the pair B, B’ represent closely related methods of summability. 
In §13 we deal briefly with multiple products and in §14 with kernel trans- 
formations. 

2. Comparison of methods A and B. It is well known that two regular 
row-finite methods A and B with a,,20, b,, 20 may be such that all of the 
relations A > B, B> A, A~B are false and in fact such that A and B are in- 
consistent.* 

3. Comparison of methods A and AB. It follows from §2 that we can choose regular row-finite methods A and B, with $a_{nk} \ge 0$, $b_{nk} \ge 0$, and a sequence $s_n$ summable A to $L_A$ and summable B to $L_B \ne L_A$. It is easy to see that regularity of A implies that $s_n$ is summable AB to $L_B$. Thus A and AB may be inconsistent, and there is no hope of showing that $A \supset AB$, $AB \supset A$, or $A \sim AB$.

4. Comparison of B and AB. Elementary examples show that $B \supset AB$ may be false, and hence that B and AB need not be equivalent, even when A and B are assumed to be regular, row-finite and $a_{nk} \ge 0$, $b_{nk} \ge 0$.

However if A is regular and $s_n$ is summable B to L, then $T_n \to L$ and regularity of A imply $U_n \to L$ so that $s_n$ is summable AB to L. It follows that if A is regular, then $AB \supset B$. This implies that B and AB must be consistent.

5. Comparison of A and A·B. When A and B are determined as in §3, the series (1.4) from which $U_n$ and $V_n$ are computed reduces to a finite sum, and obviously $U_n = V_n$; hence in this case $A \cdot B \sim AB$. It follows from §3 that A and A·B may be inconsistent, and there is no hope of showing that $A \supset A \cdot B$, $A \cdot B \supset A$, or $A \sim A \cdot B$.

6. Comparison of B and A·B. Elementary examples of row-finite transformations show that $B \supset A \cdot B$ may be false and hence that B and A·B need not be equivalent.

One might expect to be able to show that if A and B are regular, then $A \cdot B \supset B$. But this is impossible. The author† has given an example of transformations A and B (having some significant properties in addition to regularity) and a sequence $s_n$ which is summable B but non-summable A·B.

In this paper we go further and prove in §9 the following theorem: 


THEOREM 6.1. There exist regular transformations A and B with $a_{nk} \ge 0$, $b_{nk} \ge 0$ for which B and A·B are inconsistent.


7. Comparison of AB and A·B. It is not true that regularity of A and B


* Questions involving inconsistency of transformations are discussed in detail in Agnew, On 
ranges of inconsistency of regular transformations and allied topics, Annals of Mathematics, vol. 32 
(1931), pp. 715-722. 

† Agnew, Products of methods of summability, Bulletin of the American Mathematical Society, vol. 42 (1936), pp. 547-549.




330 R. P. AGNEW [May 


implies either $AB \supset A \cdot B$ or $A \cdot B \supset AB$. In fact we prove in §9 the following theorem:


THEOREM 7.1. There exist regular transformations A and B with $a_{nk} \ge 0$, $b_{nk} \ge 0$, for which AB and A·B are inconsistent.


This theorem evidences the necessity of noticing a distinction between the iteration product AB and the composition product A·B.

The transformation A of Theorem 7.1 cannot be row-finite. For if A is row-finite, then all of the terms lying below some row of the double series (1.4) from which $U_n$ and $V_n$ are computed vanish, and existence of $U_n$ implies existence of $V_n$ and the equality $V_n = U_n$. Thus we have the theorem:


THEOREM 7.2. If A is row-finite, then $A \cdot B \supset AB$.


This implies that if A is row-finite, then AB and A·B must be consistent. It is however impossible to go further and prove that AB and A·B must be equivalent. We prove the following theorem:


THEOREM 7.3. There exist regular transformations A and B with A row-finite, $a_{nk} \ge 0$, $b_{nk} \ge 0$, and a sequence $s_n$ such that $s_n$ is summable A·B but non-summable AB.


Let $p_1, p_2, p_3, \ldots$ denote in order the primes 2, 3, 5, .... For each n = 1, 2, 3, ..., let $a_{n,p_n} = a_{n,p_n^2} = 1/2$ and let $a_{nk} = 0$ otherwise. If n is neither a prime nor a square of a prime, let $b_{nn} = 1$, and $b_{nk} = 0$ otherwise. For each n = 1, 2, ... let $b_{p_n,k} = 0$ when $k \ne p_n, p_n^3, p_n^5, \ldots$, and let $b_{p_n,k} = 2^{-\alpha}$ when k is of the form $p_n^{2\alpha-1}$. Let $b_{p_n^2,k} = 0$ when $k \ne p_n^2, p_n^4, \ldots$, and let $b_{p_n^2,k} = 2^{-\alpha}$ when k is of the form $p_n^{2\alpha}$. These matrices $a_{nk}$ and $b_{nk}$ define methods A and B of summability satisfying the hypotheses of the theorem. Observe that $a_{nk} = b_{nk} = 0$ when n > k. Let the sequence $s_n$ be defined by the formulas: $s_k = 2^{\alpha+1}/\alpha$ when k is of the form $p_n^{2\alpha-1}$; $s_k = -2^{\alpha+1}/\alpha$ when k is of the form $p_n^{2\alpha}$; and $s_k = 0$ otherwise.

It can be shown that for this example the double series (1.4) from which $U_n$ and $V_n$ are computed becomes (after omission of rows and columns of zeros)

(7.31)
   $1 + 0 + \tfrac{1}{2} + 0 + \tfrac{1}{3} + 0 + \cdots$
   $+\, 0 - 1 + 0 - \tfrac{1}{2} + 0 - \tfrac{1}{3} + \cdots$

It is apparent that, for each n, the sum by columns of this series is 0, that is, $V_n = 0$; and that the sum by rows does not exist, that is, $U_n$ does not exist. Thus the sequence $s_n$ of the example is summable A·B to 0 and is non-summable AB. This proves Theorem 7.3.

The transformation B of Theorem 7.3 cannot be row-finite. For if both A and B are row-finite, then $U_n = V_n$ for every n and equivalence of AB and A·B follows.
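
The behaviour claimed for the example of Theorem 7.3 can be checked with exact arithmetic. The sketch below is illustrative only. It assumes the example reads $a_{n,p_n} = a_{n,p_n^2} = 1/2$, $b = 2^{-\alpha}$ and $s_k = \pm 2^{\alpha+1}/\alpha$ on the odd and even powers of $p_n$, so that each non-zero term of the double series has size $1/\alpha$. The function names are hypothetical and not part of the paper.

```python
from fractions import Fraction


def term(alpha: int) -> Fraction:
    """Size of the alpha-th non-zero entry in either row:
    a * b * |s| = (1/2) * 2**(-alpha) * 2**(alpha + 1) / alpha = 1/alpha."""
    return Fraction(1, 2) * Fraction(1, 2**alpha) * Fraction(2**(alpha + 1), alpha)


def row_partial_sum(m: int) -> Fraction:
    """Sum of the first m non-zero terms of the row p = p_n (all terms positive)."""
    return sum((term(a) for a in range(1, m + 1)), Fraction(0))


def column_partial_sum(q: int) -> Fraction:
    """Sum of the first q non-zero columns, ordered k = p_n, p_n^2, p_n^3, ...
    Odd powers of p_n carry +1/alpha, even powers carry -1/alpha."""
    total = Fraction(0)
    for j in range(1, q + 1):
        alpha = (j + 1) // 2
        total += term(alpha) if j % 2 == 1 else -term(alpha)
    return total
```

Under these assumptions the column partial sums alternate between 0 and $1/\alpha$ and so tend to 0, while the row partial sums are the harmonic numbers and grow without bound.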

In spite of the fact that the transformations AB and A·B of Theorem 7.1 need not be consistent, there is a large class of sequences (including all bounded sequences and all unilaterally bounded real sequences) over which they must be equivalent.

We shall say that a sequence $s_n$ lies in an angle less than π in the complex plane if there exist a point $z_0$, an angle θ, and a positive angle φ < π/2 such that for each k

(7.32)   $s_k = z_0 + \rho_k e^{i(\theta + \theta_k)}$,

where $\rho_k \ge 0$ and $|\theta_k| < \phi$.

THEOREM 7.4. If A and B are regular transformations with

(7.41)   $a_{nk} \ge 0$,   $b_{nk} \ge 0$,   n, k = 1, 2, ...,

then each sequence $s_n$ which lies in an angle less than π in the complex plane and which is summable to L by one of the methods AB and A·B is also summable to L by the other one.

The gist of this theorem is that for regular transformations A, B satisfying (7.41), the two transformations AB and A·B are equivalent in so far as application to sequences lying in an angle less than π is concerned.

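To prove the theorem, suppose first that $s_n$ is a sequence given by (7.32) for which $V_n$ exists. Then

(7.42)   $V_n = \sum_{k=1}^{\infty} \sum_{p=1}^{\infty} a_{np} b_{pk} [z_0 + \rho_k e^{i(\theta + \theta_k)}]$,

and, where $\xi_k$ and $\eta_k$ are the real and imaginary parts of $\rho_k e^{i\theta_k}$,

(7.43)   $V_n = z_0 \sum_{k=1}^{\infty} \sum_{p=1}^{\infty} a_{np} b_{pk} + e^{i\theta} \sum_{k=1}^{\infty} \sum_{p=1}^{\infty} a_{np} b_{pk} [\xi_k + i\eta_k]$.

Since $a_{nk} \ge 0$, $b_{nk} \ge 0$, $\xi_k \ge 0$, and $\eta_k$ is real, this implies convergence of

(7.44)   $\sum_{k=1}^{\infty} \sum_{p=1}^{\infty} a_{np} b_{pk} |\xi_k|$.

But $|\eta_k| \le \xi_k \tan \phi$ so that (7.44) converges when $|\xi_k|$ is replaced by $|\eta_k|$. It now follows easily that both series in (7.43) converge absolutely and hence that the series in (7.42) converges absolutely. Therefore

(7.45)   $U_n = \sum_{p=1}^{\infty} \sum_{k=1}^{\infty} a_{np} b_{pk} [z_0 + \rho_k e^{i(\theta + \theta_k)}]$

exists and $U_n = V_n$; hence $V_n \to L$ implies also $U_n \to L$. We can show similarly that $U_n \to L$ implies also $V_n \to L$, and Theorem 7.4 is proved.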
8. A double series. In the next section we use the double series

(8.1)
   $u_{11} + u_{12} + u_{13} + \cdots$
   $+\, u_{21} + u_{22} + u_{23} + \cdots$
   $+\, u_{31} + u_{32} + u_{33} + \cdots$
   $+\, \cdots$

whose terms $u_{nk}$ may be defined as follows: for each odd k

(8.2)   $u_{nk} = 1/k$ if n = (k + 1)/2,   $u_{nk} = 0$ otherwise;

and for each even k

(8.3)   $u_{nk} = -1/k$ if $n = n_k$,   $u_{nk} = 0$ otherwise,

where $n_k$ is the smallest n for which

(8.4)   $u_{n1} + u_{n2} + u_{n3} + \cdots + u_{n,k-1} - 1/k \ge 0$.

The harmonic series $\sum 1/k$ being divergent, it is easy to see that for each n, the infinite series $u_{n1} + u_{n2} + \cdots$ converges to 0. Hence the double series (8.1) converges by rows to 0 and by columns to $\log 2 = 1 - 1/2 + 1/3 - 1/4 + \cdots$. Moreover it can be shown by arithmetic methods that for each n, $u_{nk} = 0$ for all sufficiently great k; that is, each row of (8.1) contains only a finite number of non-vanishing terms.
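
The placement rule (8.2) to (8.4) is easy to carry out mechanically. The following is a minimal sketch, assuming the inequality in (8.4) is "≥ 0" and reading the rule column by column. `build_rows` and `column_partial_sum` are hypothetical helper names introduced only for illustration.

```python
from fractions import Fraction
from itertools import count
import math


def build_rows(K):
    """Place the single non-zero term of each column k = 1..K of (8.1).

    Returns (row_sums, placement): row_sums[n] is the running sum of row n,
    placement[k] is the row that holds the non-zero term of column k."""
    row_sums = {}
    placement = {}
    for k in range(1, K + 1):
        if k % 2 == 1:
            # odd k: the term +1/k goes into row (k + 1) / 2
            n = (k + 1) // 2
            row_sums[n] = row_sums.get(n, Fraction(0)) + Fraction(1, k)
        else:
            # even k: smallest n whose running sum stays >= 0 after subtracting 1/k;
            # row k/2 always qualifies because it holds 1/(k - 1)
            n = next(m for m in count(1)
                     if row_sums.get(m, Fraction(0)) - Fraction(1, k) >= 0)
            row_sums[n] -= Fraction(1, k)
        placement[k] = n
    return row_sums, placement


def column_partial_sum(K):
    """Each column has exactly one non-zero term, +1/k or -1/k, so the sum of
    the first K columns is a partial sum of the alternating harmonic series."""
    return sum((-1) ** (k + 1) / k for k in range(1, K + 1))
```

With K = 12 the first row is already exhausted: it receives 1, −1/2, −1/4, −1/6 and −1/12, which sum to exactly 0, in line with the remark that each row contains only finitely many non-vanishing terms.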

9. Proof of Theorems 6.1 and 7.1. We can now prove the following theo- 
rem of which Theorems 6.1 and 7.1 are obvious corollaries: 


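THEOREM 9.1. There exists a pair of regular transformations A and B with B row-finite,

(9.11)   $a_{nk} \ge 0$,   $b_{nk} \ge 0$,   n, k = 1, 2, ...,

(9.12)   $\sum_{k=1}^{\infty} a_{nk} = 1$,   $\sum_{k=1}^{\infty} b_{nk} = 1$,

and a sequence $s_k$ such that

(9.13)   $T_n = \sum_{k=1}^{\infty} b_{nk} s_k = 0$,

(9.14)   $U_n = \sum_{p=1}^{\infty} a_{np} \sum_{k=1}^{\infty} b_{pk} s_k = \sum_{p=1}^{\infty} a_{np} T_p = 0$,   n = 1, 2, ...,

and

(9.15)   $V_n = \sum_{k=1}^{\infty} \sum_{p=1}^{\infty} a_{np} b_{pk} s_k = 1$,   n = 1, 2, ....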

Let the positive integers 1, 2, 3,--- be displayed as a double sequence 
so that hy =1, hoy = 2, hy =3, hy, =4, Ion =5, hu=7, etc. For each 
n=1, 2,3,---, let an, be defined for k=1, 2,3,--- by the formula 

Qnk = 0, k Ine, Ins, 


9.16 
(9.16) k = 


This matrix $a_{nk}$ determines a regular method A of summability with $a_{nk} \ge 0$ and $\sum_{k=1}^{\infty} a_{nk} = 1$ for each n.
The double series (1.4) from which $U_n$ and $V_n$ are computed takes (after removal of rows of zeros) the form
1181 + 1,252 + + 
+ + + + 


(9.17) 
+ 5.151 + 3,252 + 


Let, for each m and r, 


(9.18) Din = 0, k# has, 


Then the double series (9.17) takes (after removal of columns of zeros) the 
form 


+ 0 + + in + 


(9.19) 
0 + 0 +> BD 


We observe that if ”’n’’, then the “variables” 6.3 and sy appearing in 
(9.19), when =n’, are distinct from those appearing when n=n’’. For each 
n=1, 2,3,--- let the elements bag and s, appearing in (9.19) be determined 
so that the terms of the two series (9.19) and (8.1) in corresponding positions 
will be equal, and the non-vanishing elements of the sequence 


will be equal in order to 


1/2d, 1/2°d,-- , 1/2%d 


where d=1/2+1/2?+ --- +1/2°. 

This gives a complete and unique determination of the elements of the matrix $B = (b_{nk})$. It is clear that B satisfies the hypotheses of the theorem, regularity being implied by the conditions

$b_{nk} \ge 0$,   $\sum_{k=1}^{\infty} b_{nk} = 1$

and the fact that $b_{nk} = 0$ when n > k.

It follows from identity of (9.19) and (8.1) and the fact that each row of (8.1) converges to 0, that $T_p = 0$ for each p = 1, 2, ...; and since (8.1) and hence (9.19) converge by rows to 0 and by columns to log 2 we have $U_n = 0$ and $V_n = \log 2$. If finally we divide each $s_k$ determined above by log 2, then $T_n$, $U_n$, and $V_n$ will be divided by log 2 and we obtain (9.13), (9.14), and (9.15).

10. Remarks on Theorem 9.1. The author has been unable to find an example less recondite than the one just given to prove Theorem 9.1. In case the requirements $a_{nk} \ge 0$, $b_{nk} \ge 0$ are removed, we can give simpler examples. For definiteness and convenience of reference we state the following theorem:


THEOREM 10.1. If r is a complex number with 0 < |r| < 1, then the methods

(A)   $S_n = s_n + r^{n+1} s_{n+1} + r^{n+2} s_{n+2} + \cdots$,
(B)   $T_n = [1/(1 - r)] s_n + [-r/(1 - r)] s_{n+1}$

are regular while AB and A·B are inconsistent. The sequence $s_n = (1 - r) r^{-n}$ is summable AB to 0 and A·B to 1.


Verification is straightforward and left to the reader. We note that if 0 < r < 1, the elements $a_{nk}$ are all ≥ 0 but some elements $b_{nk}$ are < 0; while if −1 < r < 0, all elements $b_{nk}$ are ≥ 0 but some elements $a_{nk}$ are < 0. If r is not real, the conditions $a_{nk} \ge 0$, $b_{nk} \ge 0$ both fail.
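
The verification left to the reader can be sketched numerically. The code below assumes the methods of Theorem 10.1 read $S_n = s_n + r^{n+1}s_{n+1} + r^{n+2}s_{n+2} + \cdots$ and $T_n = [s_n - r s_{n+1}]/(1 - r)$, with $s_k = (1 - r) r^{-k}$. The choice r = 1/2 and all function names are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction

r = Fraction(1, 2)  # any rational r with 0 < |r| < 1 keeps the arithmetic exact


def a(n, k):
    """Assumed matrix of (A): a_nn = 1, a_nk = r**k for k > n, 0 for k < n."""
    if k < n:
        return Fraction(0)
    return Fraction(1) if k == n else r**k


def b(p, k):
    """Assumed matrix of (B): b_pp = 1/(1 - r), b_{p,p+1} = -r/(1 - r)."""
    if k == p:
        return 1 / (1 - r)
    if k == p + 1:
        return -r / (1 - r)
    return Fraction(0)


def s(k):
    return (1 - r) * r**(-k)


def T(p):
    """B transform of the sequence; each row of B has two non-zero entries."""
    return b(p, p) * s(p) + b(p, p + 1) * s(p + 1)


def V_partial(n, K):
    """Sum of the first K columns of the double series (1.4) for fixed n."""
    total = Fraction(0)
    for k in range(1, K + 1):
        # column k only meets rows p = k - 1 and p = k of B
        c_nk = sum(a(n, p) * b(p, k) for p in (k - 1, k) if p >= 1)
        total += c_nk * s(k)
    return total
```

Every $T_p$ vanishes, so the sum by rows is $U_n = 0$, while the column partial sums equal 1 as soon as K > n, so $V_n = 1$.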

The method A of Theorem 10.1 is, for each admissible r, equivalent to convergence. For on one hand A is regular. On the other hand if $s_k$ is summable A to L so that

(10.11)   $S_n = s_n + r^{n+1} s_{n+1} + r^{n+2} s_{n+2} + \cdots$

exists and $S_n \to L$, then convergence of the series in (10.11) implies that

(10.12)   $\lim_{n \to \infty} (r^{n+1} s_{n+1} + r^{n+2} s_{n+2} + \cdots) = 0$,

and hence that

(10.13)   $\lim s_n = \lim S_n = L$.

The method B is not only regular but also has several other features at times desirable in methods of summability. The permissibility of removal or adjunction of elements at the beginning of a sequence is such a feature.

These remarks make it appear likely that significant theorems giving conditions sufficient for consistency of AB and A·B (or for $AB \supset A \cdot B$, or for $A \cdot B \supset AB$, or for $AB \sim A \cdot B$) will involve classes of methods defined by matrices of more or less restricted types rather than involve classes of methods having various ones of the numerous “desirable” properties of methods of summability.

The following theorem indicates the possibility of obtaining constructive theorems involving AB and A·B, and is of interest in connection with Theorem 10.1:


THEOREM 10.2. If A and B are regular transformations with $a_{nk} \ge 0$, $b_{nk} \ge 0$ and if B is of the form

$T_n = \sum_{k=n-\alpha}^{n+\beta} b_{nk} s_k$,

where α and β are non-negative integers, then $AB \supset A \cdot B$.

In interpreting B, we agree that $s_k = 0$ when k < 1, and that $b_{pk} = 0$ when k < p − α and when k > p + β. Assuming $s_k$ to be a sequence for which


k+a 


(10.21) An pd = Dy Dy Anpbpksi 


k=1 p=k—8 
exists, we have for fixed n 
Q 


Q 


k=l p=1 p=1 k=1 


Q-8 Qta Q 


p=1 k=1 p=Q—8+1 k=1 


(10. 22) 


and hence 
Q 


Q-8 
(10.23) V, = lim { anpd, + ¥ 


p=1 k=l p=Q-8+1 kemQ—a—S+1 


The operations under the limit sign are justified by vanishing of elements $b_{pk}$. Now convergence of the first series in (10.21) implies that

$A_k = \sum_{p=1}^{\infty} a_{np} b_{pk} |s_k| \to 0$

as $k \to \infty$. But since $a_{np} \ge 0$ and $b_{pk} \ge 0$, $0 \le a_{np} b_{pk} |s_k| \le A_k$ for each fixed p.


Hence 


Qta Q Qta 


Q Q 
pkSk| S = (a+ B) > A. — 0 


k=Q-a—f+1 p=Q-8+1 k=Q—a—S+1 


as $Q \to \infty$. This fact and (10.23) imply that


“ 


k=1 


p=1 p=1 k=p—a 


exists and $U_n = V_n$. This argument shows that $AB \supset A \cdot B$, and Theorem 10.2 is proved.

Examples show it is impossible to modify the argument to prove $A \cdot B \supset AB$. For one such example, it suffices to put, for each n = 1, 2, ...,


an. = 0, k # n’,--- ’ 


(10.4 
k= 


and for each n=2, 3,--- 


= 
(10.5) 


= k=n—1,n, 
while $b_{11} = 1$ and $b_{1k} = 0$ for k > 1. The sequence defined by $s_n = (-1)^n \log n$ is summable AB to 0 and is non-summable A·B. We note in passing that (10.5) defines a regular Nörlund method of summability which is included by the arithmetic mean method.

11. The arithmetic mean and generalizations. Let $p_1, p_2, p_3, \ldots$ be a sequence of constants with


(11.01) m=1,2,---. 
Let P denote the method of summability associated with the transformation

(P)   $T_n = (p_1 s_1 + p_2 s_2 + \cdots + p_n s_n)/P_n$.

The transformations P differ from the more familiar Nörlund methods in order of distribution of the “weights” $p_k$; but share with Nörlund methods the property of reducing to the important arithmetic mean method $C_1$ (or M) when $p_k = 1$ for each k. The theorems of this section therefore give facts involving $C_1$.


THEOREM 11.1. If A and P (regular or not) are methods of summability with $a_{nk} \ge 0$, $P_n > 0$, then $AP \supset A \cdot P$.


The AP and A·P transforms $U_n$ and $V_n$ of a sequence $s_k$ are (if they exist) determined as the sum by rows and the sum by columns of the double series obtained by removing parentheses from the series

   $P_1^{-1} a_{n1} (p_1 s_1 + 0 + 0 + 0 + \cdots)$
   $+\, P_2^{-1} a_{n2} (p_1 s_1 + p_2 s_2 + 0 + 0 + \cdots)$
   $+\, P_3^{-1} a_{n3} (p_1 s_1 + p_2 s_2 + p_3 s_3 + 0 + \cdots)$
   $+\, \cdots$

We show that existence of $V_n$ implies existence of $U_n$ and the equality $U_n = V_n$. This follows, on introducing obvious notation and interchanging rows and columns, from the following lemma:


LEMMA 11.2. If $\theta_n \ge 0$ for n = 1, 2, ..., and the series obtained by removing parentheses from the series

(11.20)
   $\sigma_1 (\theta_1 + \theta_2 + \theta_3 + \cdots)$
   $+\, \sigma_2 (\theta_2 + \theta_3 + \cdots)$
   $+\, \sigma_3 (\theta_3 + \cdots)$
   $+\, \cdots$

converges by rows to Λ, then it also converges by columns to Λ.

To prove the lemma, let

(11.21)   $R_n = \theta_n + \theta_{n+1} + \cdots$.

If $R_n = 0$ for some n, then $\theta_m = 0$ for all sufficiently great m, and the conclusion of the lemma obviously holds. Hence we may assume $R_n \ne 0$ for all n. Let

(11.22)   $\omega_n = \sigma_n (\theta_n + \theta_{n+1} + \cdots) = \sigma_n R_n$.

Then

(11.23)   $\sum \omega_n = \Lambda$,   $\sigma_n = \omega_n / R_n$.

The sum of the first n columns of the series (11.20) is

(11.24)   $K_n = \sigma_1 (R_1 - R_{n+1}) + \sigma_2 (R_2 - R_{n+1}) + \cdots + \sigma_n (R_n - R_{n+1})$.

This can be written

(11.25)   $K_n = \sum_{k=1}^{n} c_{nk} \omega_k$

where

(11.26)   $c_{nk} = 1 - R_{n+1}/R_k$.

The fact that $R_n \to 0$ monotonely as $n \to \infty$ enables us to show that (11.25) defines a regular method of evaluating series,* that is, $\sum \omega_n = \Lambda$ implies $K_n \to \Lambda$. This completes the proof of Lemma 11.2 and hence the proof of Theorem 11.1.
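
The step from the column sums to a weighted sum of the row sums in the proof of Lemma 11.2 is a finite rearrangement, and can be checked on a truncated example. The sketch below assumes the series has the triangular shape $\sigma_j(\theta_j + \theta_{j+1} + \cdots)$ with only finitely many non-zero $\theta$. The names `column_sum` and `regular_form` are hypothetical.

```python
from fractions import Fraction


def column_sum(sigma, theta, n):
    """Sum of the first n columns: column j is theta_j * (sigma_1 + ... + sigma_j).
    Lists are 0-based, so theta[0] stands for theta_1."""
    return sum(theta[j] * sum(sigma[: j + 1]) for j in range(n))


def regular_form(sigma, theta, n):
    """Sum over k <= n of (1 - R_{n+1}/R_k) * omega_k with omega_k = sigma_k * R_k,
    where R_k is the tail theta_k + theta_{k+1} + ... of the (finite) theta list."""
    R = [sum(theta[i:]) for i in range(len(theta) + 1)]  # R[i] = tail starting at index i
    return sum((1 - R[n] / R[k]) * sigma[k] * R[k] for k in range(n))
```

For positive `theta` the two functions agree for every n up to the length of the lists, which is the identity the regularity argument then builds on.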

It is impossible to strengthen Theorem 11.1 by proving that $A \cdot P \supset AP$, even when P is the arithmetic mean transformation $C_1$. In fact, we prove the following theorem:


THEOREM 11.3. Corresponding to each transformation of the form 
(P)   $T_n = (p_1 s_1 + p_2 s_2 + \cdots + p_n s_n)/P_n$,


where p, Pn=pit --- for each n=1, 2,--- , there is a regular 
transformation 


(A)   $S_n = \sum_{k=1}^{\infty} a_{nk} s_k$

with $a_{nk} \ge 0$, such that A·P does not include AP.


Let pi, p2, - - - denote in order the odd primes. For each n= 1, 2, 3, - - - 
let 
0, k 2pn, 2p2,--- 


11.30 


The double series, of which $U_n$ is the sum by rows and $V_n$ the sum by columns, becomes, after removing rows of zeros and factoring the remaining rows


where t =p,.. For each n=1, 2, - - - , let 
(11 32) Soe = 1/2°P «pore, 
and let s2,=0 when & is not of the form ,’. For each , let 


(11.33) = — 


We now have a complete and unique determination of a regular matrix $a_{nk}$ and a sequence $s_k$. For each n, the series (11.31) converges by rows to 0 and fails to converge by columns. Therefore the sequence $s_k$ is summable AP to 0 and is non-summable A·P.


* Carmichael, Bulletin of the American Mathematical Society, vol. 25 (1918), pp. 97-131. 




12. Comparison of AB with A′B′ and A·B with A′·B′. It is trivially easy to see that if A, A′, B, B′ are regular transformations and there is an index $n_0$ such that $a_{nk} = a'_{nk}$ and $b_{nk} = b'_{nk}$ when $n \ge n_0$, then $A \sim A'$ and $B \sim B'$. A comparison of the two methods AB and A′B′, or the two methods A·B and A′·B′, is not so simple. We can however prove the following theorem:


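THEOREM 12.1. Let

(A), (A′)   $S_n = \sum_{k=1}^{\infty} a_{nk} s_k$,   $S_n' = \sum_{k=1}^{\infty} a'_{nk} s_k$,

(B), (B′)   $T_n = \sum_{k=1}^{\infty} b_{nk} s_k$,   $T_n' = \sum_{k=1}^{\infty} b'_{nk} s_k$,

be four regular methods of summability, and let an index N exist such that $a'_{nk} = a_{nk}$ and $b'_{nk} = b_{nk}$ when $n \ge N$. Then the two methods of summability

(AB)   $U_n = \sum_{p=1}^{\infty} a_{np} \sum_{k=1}^{\infty} b_{pk} s_k$,

(A′B′)   $U_n' = \sum_{p=1}^{\infty} a'_{np} \sum_{k=1}^{\infty} b'_{pk} s_k$,

are consistent, and the two methods

(A·B)   $V_n = \sum_{k=1}^{\infty} \sum_{p=1}^{\infty} a_{np} b_{pk} s_k$,

(A′·B′)   $V_n' = \sum_{k=1}^{\infty} \sum_{p=1}^{\infty} a'_{np} b'_{pk} s_k$,

are consistent.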

We prove first that (AB) and (A′B′) are consistent. Suppose $s_n$ is a sequence summable AB to L and summable A′B′ to L′ so that $U_n \to L$, $U_n' \to L'$. Then

$T_p = \sum_{k=1}^{\infty} b_{pk} s_k$,   $T_p' = \sum_{k=1}^{\infty} b'_{pk} s_k$

must exist for each p = 1, 2, .... Letting $o_n$ denote generically quantities depending on n which converge to 0 as $n \to \infty$, we find


N-1 N-1 
p=1 p=N 


k=1 k=N 


On + > E + boas | 
p=N 


k=N 


= 0, + DpkSk- 
p=N 


k=N 




Likewise 


UZ = On + > 


k=N 


But over the ranges of summation in this series, we have, when $n \ge N$, $a'_{np} = a_{np}$ and $b'_{pk} = b_{pk}$. Hence $U_n' = o_n + U_n$, and it follows that L = L′. Thus AB and A′B′ are consistent.

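To prove that A·B and A′·B′ are consistent, let $s_n$ be a sequence summable A·B to Λ and summable A′·B′ to Λ′ so that $V_n \to \Lambda$, $V_n' \to \Lambda'$. Since $a'_{np} = a_{np}$ when $n \ge N$, we can write for $n \ge N$

$V_n' = \sum_{k=1}^{\infty} \left[ \sum_{p=1}^{N} a_{np} b'_{pk} s_k + \sum_{p=N+1}^{\infty} a_{np} b'_{pk} s_k \right]$.

Since $b'_{pk} = b_{pk}$ when p > N, it follows that when $n \ge N$

$V_n - V_n' = \sum_{k=1}^{\infty} \sum_{p=1}^{N} a_{np} (b_{pk} - b'_{pk}) s_k$.

An application of the following lemma with $d_{pk} = (b_{pk} - b'_{pk}) s_k$ shows that $V_n - V_n' \to 0$ as $n \to \infty$ and hence that Λ = Λ′ and that A·B and A′·B′ are consistent.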


LEMMA 12.2. If

(12.21)   $\lim_{n \to \infty} a_{np} = 0$,   p = 1, 2, ..., N,

and

(12.22)   $\Lambda_n = \sum_{k=1}^{\infty} \sum_{p=1}^{N} a_{np} d_{pk}$

exists for each sufficiently great n (say $n \ge n_0$), then

(12.23)   $\lim_{n \to \infty} \Lambda_n = 0$.

We prove this lemma by induction, considering first the case where N = 1 and accordingly

$\Lambda_n = \sum_{k=1}^{\infty} a_{n1} d_{1k}$,   $n \ge n_0$.

If $a_{n1} = 0$ for all sufficiently great n, then obviously $\Lambda_n \to 0$. Otherwise we can choose $n_1 \ge n_0$ such that $a_{n_1 1} \ne 0$ and conclude existence of

$B_1 = \sum_{k=1}^{\infty} d_{1k}$,

so that

$\Lambda_n = a_{n1} B_1$.

Since $a_{n1} \to 0$, it follows that $\Lambda_n \to 0$. Thus the lemma holds when N = 1.
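In case N = 2, we have

(12.24)   $\Lambda_n = \sum_{k=1}^{\infty} (a_{n1} d_{1k} + a_{n2} d_{2k})$,   $n \ge n_0$.

Suppose two multipliers $\mu_1$ and $\mu_2$, not both zero, exist such that

$\mu_1 a_{n1} + \mu_2 a_{n2} = 0$,   $n \ge n_0$.

Then supposing $\mu_2 \ne 0$ (the case $\mu_1 \ne 0$ is analogous) and putting $\rho = -\mu_1/\mu_2$ we have

$a_{n2} = \rho a_{n1}$,   $n \ge n_0$,

and substitution in (12.24) gives

$\Lambda_n = a_{n1} \sum_{k=1}^{\infty} (d_{1k} + \rho d_{2k})$.

Now $\Lambda_n \to 0$ follows from truth of the lemma for N = 1.

If multipliers $\mu_1$ and $\mu_2$ do not exist as above, then we can choose $n_1$ and $n_2$ such that $n_0 \le n_1 < n_2$ and

(12.25)   $a_{n_1 1} a_{n_2 2} - a_{n_1 2} a_{n_2 1} \ne 0$.

From

$\Lambda_{n_1} = \sum_{k=1}^{\infty} (a_{n_1 1} d_{1k} + a_{n_1 2} d_{2k})$,   $\Lambda_{n_2} = \sum_{k=1}^{\infty} (a_{n_2 1} d_{1k} + a_{n_2 2} d_{2k})$

we conclude existence of

$a_{n_2 2} \Lambda_{n_1} - a_{n_1 2} \Lambda_{n_2} = \sum_{k=1}^{\infty} (a_{n_1 1} a_{n_2 2} - a_{n_1 2} a_{n_2 1}) d_{1k}$.

This relation and (12.25) enable us to conclude convergence of the first, and similarly we conclude convergence of the second of the series

$B_1 = \sum_{k=1}^{\infty} d_{1k}$,   $B_2 = \sum_{k=1}^{\infty} d_{2k}$.

Therefore we can write (12.24) in the form

$\Lambda_n = a_{n1} B_1 + a_{n2} B_2$,

and $\Lambda_n \to 0$ follows from $a_{n1} \to 0$, $a_{n2} \to 0$. Thus the lemma holds when N = 2.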
Assuming that the lemma holds when N < R, we can show that

$\Lambda_n = \sum_{k=1}^{\infty} \sum_{p=1}^{R} a_{np} d_{pk} \to 0$

in the case where multipliers $\mu_1, \mu_2, \ldots, \mu_R$ not all zero exist such that

$\mu_1 a_{n1} + \mu_2 a_{n2} + \cdots + \mu_R a_{nR} = 0$,   $n \ge n_0$;

and also in the alternative case where $n_1 < n_2 < \cdots < n_R$ all exist greater than $n_0$ and such that the determinant

$\det |a_{n_\alpha \beta}|$,   α, β = 1, ..., R,

does not vanish. The methods are analogous to those of our proof for the case N = 2. This completes the proof of Lemma 12.2 and hence of Theorem 12.1.

The reader may naturally be disgruntled by the conclusions of Theorem 12.1; the theorem would be more satisfying if we could replace “consistent” by “equivalent,” but we cannot do this. It is clear that under the hypotheses of the theorem AB and A′B are equivalent; and A·B and A′·B are equivalent. But AB and AB′ (or A·B and A·B′) need not be equivalent.

For an example, put $a_{n1} = 1/n$, n = 1, 2, ...; $a_{nn} = 1 - 1/n$, n = 2, 3, ...; and $a_{nk} = 0$ when k ≠ 1, n. Let $b_{1k} = 1/2^p$ when k has the form 4p − 3, and $b_{1k} = 0$ otherwise. Let $b'_{1k} = 1/2^p$ when k has the form 4p − 1 and $b'_{1k} = 0$ otherwise.
Let for each 


= = 1, k=n, 
= 0, otherwise. 
The sequence $s_k$ = [$(-2)^p$ when k = 4p − 1 and 0 otherwise] is summable AB and A·B to 0 but is non-summable AB′ and A·B′. The sequence $s_k$ = [$(-2)^p$ when k = 4p − 3 and 0 otherwise] is summable AB′ and A·B′ to 0 but is non-summable AB and A·B. Thus AB and AB′ have overlapping convergence fields, as do A·B and A·B′.




13. Multiple products. In terms of three methods A, B, C of summability, we can define five types of products: ABC, A(B·C), (A·B)C, A·(B·C) and (A·B)·C. It is easy to show that the last two methods are equivalent. It follows easily from Theorem 7.1 that each other pair selected from the five products may be inconsistent, even though A, B, C are regular and $a_{nk} \ge 0$, $b_{nk} \ge 0$, and $c_{nk} \ge 0$.

14. Kernel transformations. Just as matrices $(a_{nk})$ serve to define generalizations of $\lim_{n \to \infty} s_n$, so also kernels a(x, t) serve to define generalizations of $\lim_{t \to \infty} s(t)$. The $\mathcal{A}$ transform of a function s(t) is, if it exists, given by

($\mathcal{A}$)   $S(x) = \int_0^{\infty} a(x, t) s(t)\, dt$,

and s(t) is summable $\mathcal{A}$ to L if $S(x) \to L$ as $x \to \infty$.

Since the transformation $\mathcal{A}$ becomes essentially a matrix transformation of sequences when we put $s(t) = s_k$ when $k - 1 \le t < k$ and $a(x, t) = a_{nk}$ when $n - 1 \le x < n$, $k - 1 \le t < k$, so that $S(x) = S_n$ when $n - 1 \le x < n$, it follows that some of our results have immediate application to kernel transformations. We mention only the fact that the iteration product

($\mathcal{A}\mathcal{B}$)   $U(x) = \int_0^{\infty} a(x, y)\, dy \int_0^{\infty} b(y, t) s(t)\, dt$

and the composition product

($\mathcal{A} \cdot \mathcal{B}$)   $V(x) = \int_0^{\infty} s(t)\, dt \int_0^{\infty} a(x, y) b(y, t)\, dy$

may represent inconsistent methods of summability of functions, even though $\mathcal{A}$ and $\mathcal{B}$ are regular and the kernels a(x, t) and b(x, t) are everywhere non-negative. It thus appears that a formal change of order of integration in a right member above may not only produce a meaningless integral but may actually produce a wrong answer.


CORNELL UNIVERSITY,
ITHACA, N. Y.


THE GEOMETRY OF WHIRL SERIES* 


BY 
JOHN DE CICCO 


INTRODUCTION 


In this paper, we shall give some results in addition to those given in a paper called The geometry of isogonal and equi-tangential series by Kasner.†

We begin by considering certain simple operations or transformations on the oriented lineal elements of the plane. A turn $T_\alpha$ converts each element into one having the same point and a direction making a fixed angle α with the original direction. By a slide $S_k$ the line of the element remains the same and the point moves along the line a fixed distance k. These transformations together generate a continuous group of three parameters which is called the group of whirl transformations.‡

Applying a turn $T_\alpha$ to the tangential elements of a union produces an isogonal series while a slide $S_k$ produces an equi-tangential series. When we apply a whirl to the tangential elements of a union we obtain a whirl series.

Kasner has proved that (1) any element transformation which converts every isogonal series into an isogonal series must be the product of a conformal transformation by a turn. He also has proved that (2) any element transformation which carries every equi-tangential series into an equi-tangential series must be the product of an equi-long transformation by a magnification by a slide.

We shall derive certain generalizations of the above results, two of which are the following: Any element transformation which carries every union into a whirl series must be the product of a contact transformation by a whirl. Any element transformation which converts every whirl series into a whirl series must be the product of a rigid motion by a magnification by a whirl.

We also find that, for an arbitrary element transformation, the maximum 
number of unions, isogonal, equitangential and whirl series which are trans- 
formed into whirl series are respectively 2°, «7, 


I. THE DIFFERENTIAL EQUATION OF ALL WHIRL SERIES 


For the analytic representation of an element, it will be found convenient 
to use two systems of coordinates called the cartesian and the hessian co- 


* Presented to the Society, April 10, 1936; received by the editors April 16, 1937. 

† These Transactions, vol. 42 (1937), pp. 94-106.

‡ Kasner, The group of turns and slides and the geometry of turbines, American Journal of Mathematics, vol. 33 (1911), pp. 193-202.



GEOMETRY OF WHIRL SERIES 345 


ordinate systems respectively. The cartesian coordinates of an element E are (x, y, θ) where (x, y) are the cartesian coordinates of the point of E and θ = arc tan y′ is the inclination of the line of E. The hessian coordinates of an element E are (u, v, w) where v is the length of the perpendicular from the origin to the line of E, u is the angle between the perpendicular and the initial line, and w is the distance between the foot of the perpendicular and the point of E.

The equations of the slide $S_k$ are

U = u,   V = v,   W = w + k.

The equations of the turn $T_\alpha$ are

U = u + α,   V = v cos α + w sin α,   W = −v sin α + w cos α.


Since any whirl transformation may be given in the form $W = T_\alpha S_k T_\beta$, it follows that the equations of the whirl $W = T_\alpha S_k T_\beta$ are

(1)   U = u + α + β,
      V = v cos (α + β) + w sin (α + β) + k sin β,
      W = −v sin (α + β) + w cos (α + β) + k cos β.

From (1), it is found that any whirl series is given by the equations

(2)   V = v(U − α − β) cos (α + β) + v′(U − α − β) sin (α + β) + k sin β,
      W = −v(U − α − β) sin (α + β) + v′(U − α − β) cos (α + β) + k cos β,

where (u, v(u), v′(u)) are the elements of the fundamental union, and α, k, β are the constant parameters of the whirl transformation.
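
The closed form (1) can be checked against direct composition of a turn, a slide and a second turn in hessian coordinates. This is a minimal numerical sketch. It assumes the product $T_\alpha S_k T_\beta$ is read left to right (first $T_\alpha$, then $S_k$, then $T_\beta$), and the function names are illustrative only.

```python
import math


def turn(alpha, element):
    """Turn T_alpha acting on hessian coordinates (u, v, w)."""
    u, v, w = element
    return (u + alpha,
            v * math.cos(alpha) + w * math.sin(alpha),
            -v * math.sin(alpha) + w * math.cos(alpha))


def slide(k, element):
    """Slide S_k: the point moves a distance k along the line of the element."""
    u, v, w = element
    return (u, v, w + k)


def whirl_closed_form(alpha, k, beta, element):
    """Equations (1) for the whirl T_alpha S_k T_beta."""
    u, v, w = element
    c, s = math.cos(alpha + beta), math.sin(alpha + beta)
    return (u + alpha + beta,
            v * c + w * s + k * math.sin(beta),
            -v * s + w * c + k * math.cos(beta))


def whirl_by_composition(alpha, k, beta, element):
    """Apply T_alpha first, then S_k, then T_beta."""
    return turn(beta, slide(k, turn(alpha, element)))
```

For any element and any parameters the two functions agree up to rounding, which is the content of the addition formulas used in passing from the turn and slide equations to (1).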

THEOREM 1. In hessian coordinates the necessary and sufficient condition that the series V = V(U), W = W(U) be a whirl series is that the functions V and W satisfy the equation of third order

(3)   $\frac{d}{dU} \left( \frac{V'' - W'}{V' + W''} \right) = 0$.

In cartesian coordinates the necessary and sufficient condition that the series X = X(θ), Y = Y(θ) be a whirl series is that the functions X and Y satisfy the equation


d {—X'+ Y’ — X’+Y’\? 
If in hessian coordinates, V=V(U), W=W(U) is a whirl series, then it 


may be given by the equations (2) which obviously satisfy (3). 
Let now the series V=V(U), W=W(U) be such that the functions V 


346 JOHN DE CICCO [May 


and W satisfy (3). By integrating (3) with respect to U and writing the constant of integration as tan (α + β), we obtain the equation

(V″ − W′) cos (α + β) − (V′ + W″) sin (α + β) = 0,

which upon a second integration yields

V′ cos (α + β) − W′ sin (α + β) = V sin (α + β) + W cos (α + β) − k cos α,

where k is a constant. From this equation, it is seen that the equations

u = U − α − β,
v = V cos (α + β) − W sin (α + β) + k sin α,
w = V sin (α + β) + W cos (α + β) − k cos α

represent a union. Solving them for U, V, W, and expressing V, W in terms of U, we are led to (2) and thus conclude that our given series is a whirl series. The proof for cartesian coordinates is similar.


COROLLARY. In hessian coordinates, the necessary and sufficient condition that the series U = U(t), V = V(t), W = W(t) be a whirl series is that the functions U, V, and W satisfy the equation

(5)   $\frac{d}{dt} \left( \frac{U_t V_{tt} - V_t U_{tt} - U_t^2 W_t}{U_t^2 V_t + U_t W_{tt} - W_t U_{tt}} \right) = 0$.

In cartesian coordinates, the necessary and sufficient condition that the series X = X(t), Y = Y(t), θ = θ(t) be a whirl series is that the functions X, Y, and θ satisfy the equation


dM 
(6) = 0(1+ 


where 
X + OV — Y Bu 
OXu — XOu + VO? 


M = 


II. UNIONS INTO WHIRL SERIES 


THEOREM 2. Under any element transformation, there exists in general a 
four-parameter family of unions which are transformed into whirl series. Any 
element transformation which converts every union into a whirl series must be 
the product of a contact transformation by a whirl. 


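By any element transformation, the elements of a union

u,   v = v(u),   w = v′(u),

become the elements

U = φ(u, v, v′),   V = ψ(u, v, v′),   W = χ(u, v, v′).

If this series is a whirl series, then by (5) we must have

(7)   $\frac{d}{du} \left[ \frac{\dfrac{d\phi}{du} \dfrac{d^2\psi}{du^2} - \dfrac{d\psi}{du} \dfrac{d^2\phi}{du^2} - \dfrac{d\chi}{du} \left( \dfrac{d\phi}{du} \right)^2}{\dfrac{d\psi}{du} \left( \dfrac{d\phi}{du} \right)^2 + \dfrac{d\phi}{du} \dfrac{d^2\chi}{du^2} - \dfrac{d\chi}{du} \dfrac{d^2\phi}{du^2}} \right] = 0$.

Upon substituting the values of the derivatives of φ, ψ, and χ into (7), we have

(8)   $\frac{d}{du} \left( \frac{A + Bv'' + Cv''^2 + Dv''^3 + Ev'''}{K + Lv'' + Mv''^2 + Nv''^3 + Pv'''} \right) = 0$,

where A, B, C, D, E, K, L, M, N, P are functions of u, v, v′ only.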

Since (8) is a differential equation of the fourth order in v, it follows that there is in general a four-parameter family of unions which under the above element transformation are carried into whirl series. All the different cases in the preceding paper have been discussed in a paper by Kasner and De Cicco called The classification of element transformations by isogonal and equi-tangential series, published in the Proceedings of the National Academy of Sciences, vol. 24 (1938), no. 1, pp. 34-38. The different cases of this paper will be considered in a later paper by the author.

Our next problem is to determine all element transformations which convert any union into a whirl series. Then (8) is an identity, and the coefficient of $v''''$ in (8) is zero, whence

$\frac{A + Bv'' + Cv''^2 + Dv''^3}{K + Lv'' + Mv''^2 + Nv''^3} = \frac{E}{P}$.

Since this relation is satisfied for every union, the ratios A/K, B/L, C/M, D/N, E/P have a common value λ(u, v, v′) and (8) becomes

$\frac{d\lambda}{du} = \lambda_u + v' \lambda_v + v'' \lambda_{v'} = 0$.

Hence λ is a constant, say tan α, and we obtain

(9)   $\frac{A}{K} = \frac{B}{L} = \frac{C}{M} = \frac{D}{N} = \frac{E}{P} = \tan \alpha$.




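Of the functions A, B, C, D, E, K, L, M, N, P, we shall need the explicit expressions for the following ones:

$A = (\phi_u + v'\phi_v)(\psi_{uu} + 2v'\psi_{uv} + v'^2\psi_{vv}) - (\psi_u + v'\psi_v)(\phi_{uu} + 2v'\phi_{uv} + v'^2\phi_{vv}) - (\chi_u + v'\chi_v)(\phi_u + v'\phi_v)^2$,

$K = (\psi_u + v'\psi_v)(\phi_u + v'\phi_v)^2 + (\phi_u + v'\phi_v)(\chi_{uu} + 2v'\chi_{uv} + v'^2\chi_{vv}) - (\chi_u + v'\chi_v)(\phi_{uu} + 2v'\phi_{uv} + v'^2\phi_{vv})$,

$D = \phi_{v'}\psi_{v'v'} - \psi_{v'}\phi_{v'v'} - \chi_{v'}\phi_{v'}^2$,

$N = \psi_{v'}\phi_{v'}^2 + \phi_{v'}\chi_{v'v'} - \chi_{v'}\phi_{v'v'}$,

$E = \psi_{v'}(\phi_u + v'\phi_v) - \phi_{v'}(\psi_u + v'\psi_v)$,

$P = \chi_{v'}(\phi_u + v'\phi_v) - \phi_{v'}(\chi_u + v'\chi_v)$.

We observe that our element transformation may be expressed as the product of some other transformation

U = Φ(u, v, w),   V = Ψ(u, v, w),   W = X(u, v, w),

by the turn $T_\alpha$, where α is the constant angle of (9). Then we have

φ = Φ + α,   ψ = Ψ cos α + X sin α,   χ = −Ψ sin α + X cos α.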


It is clear that the equation E/P =tan a may be written as 


Yo Vu + vp. 
dv’ gu + 


= tana. 
Xv’ Xu + 


gu’ du + 
Substituting the equivalent expressions in terms of Φ, Ψ, X and simplifying, we obtain the relation


(11) 


not containing a. 
It is readily seen that the equation D/N =tan a may be written as 


Vy 


av’ 
Substituting the equivalent expressions in terms of Φ, Ψ, X, we obtain eventually the equation


(= ) 
dv’ 
= tana. 
0 | Xv’ | 
(Vy 
dv’ 




which upon integration with respect to v’ becomes 
(12) = X+ u(u, »), 


v’ 


where μ is a function of u and v only.
The equation A/K =tan a can be put into the form 


0 Yut vp» 
Ou \bu + 
0 Xu + 
du dv du + 
Substituting into this equation the equivalent expressions in terms of Φ, Ψ, X we obtain, on simplification


From (11), (12), and (13), we have 


0 
(=. +2 v1) = put = 0. 
Ou ov 


Therefore μ is a constant k.
We therefore conclude that the functions Φ, Ψ, X satisfy the equations


(14) 


Hence it follows that our given transformation is the product of a contact 
transformation by a slide by a turn, that is, it is the product of a contact 
transformation by a whirl. That this condition is sufficient is obvious. 


III. EQUI-TANGENTIAL SERIES INTO WHIRL SERIES 


THEOREM 3. Under any element transformation, there exists in general a 
five-parameter family of equi-tangential series which are transformed into whirl 
series. Any contact transformation which carries every equi-tangential series into 
a whirl series must carry every equi-tangential series into an equi-tangential 
series; that is, it must be the product of an equi-long transformation by a magnifi- 
cation. Any element transformation which converts every equi-tangential series 
into a whirl series must be the product of an equi-long transformation by a 
magnification by a whirl. 



= Vy 


By any element transformation the elements of an equi-tangential series 
$u$, $v = v(u)$, $w = v'(u) + k$, become the elements
$U = \varphi(u, v, w)$, $V = \psi(u, v, w)$, $W = \chi(u, v, w)$.
If this series is a whirl series, then by (5) we have 


dp dy dx 


du du® dudu® du du 


du| dy 
du \du du du2 du du? 


(15)


Upon substituting the values of the derivatives of $\varphi$, $\psi$, $\chi$ into (15), we
have

$$\frac{d}{du}\left(\frac{A + Bw' + Cw'^2 + Dw'^3 + Ew''}{K + Lw' + Mw'^2 + Nw'^3 + Pw''}\right) = 0, \tag{16}$$


where A, B, C, D, E, K, L, M, N, P are functions of u, v, w, k only. 

Since (16) is a differential equation of the fourth order in v, the complete 
solution of (16) contains four constants of integration and the constant $k$ of
our equi-tangential series. Therefore, there is in general a five-parameter 
family of equi-tangential series which under the above element transforma- 
tion are carried into whirl series. 

Our next problem is to determine all element transformations which convert
any equi-tangential series into a whirl series. Then (16) is an identity,
hence the coefficient of $w'''$ is zero. Thus

$$\frac{A + Bw' + Cw'^2 + Dw'^3}{K + Lw' + Mw'^2 + Nw'^3} = \frac{E}{P}.$$

Since this equation is satisfied for every equi-tangential series, the ratios
A/K, B/L, C/M, D/N, E/P have a common value $\lambda(u, v, w, k)$, and (16)
becomes

$$\frac{d\lambda}{du} = \lambda_u + (w - k)\lambda_v + w'\lambda_w = 0;$$

that is, $\lambda$ is a function of $k$ only. Thus we obtain the result

$$\frac{A}{K} = \frac{B}{L} = \frac{C}{M} = \frac{D}{N} = \frac{E}{P} = \lambda(k), \tag{17}$$

where $\lambda$ is a function of $k$ only.
Now since a union is a special case of an equi-tangential series, it follows 




by Theorem 2 that the transformation which we are seeking must be the 
product of a contact transformation by a whirl. It remains therefore to con- 
sider the contact transformations which carry any equi-tangential series into 
a whirl series. 

If $\lambda = 0$, it follows from (17) that the numerator of the fraction in (16),
and hence that of the fraction in (15), vanishes for every equi-tangential 
series. Thus, the equation 


(18) 


must be identically satisfied. From (18) it follows that our contact transfor- 
mation converts every equi-tangential series into an equi-tangential series. 
Therefore according to Kasner’s result our contact transformation must be 
the product of an equi-long transformation by a magnification. 

We shall prove that there is no contact transformation with the desired
property for which $\lambda$ is different from zero. In the first place, our contact
transformation can not be an extended line transformation. For any such
transformation with the desired property must carry every equi-tangential
series into an equi-tangential series, and thus $\lambda$ must be zero. Therefore for
our contact transformation, we must have


wy 


_ ve 
ou + wor Pw 


(19) x 
where and #0. 
The equation $D/N = \lambda$ has the form
+ PwXww XwP ww 


Since $\lambda \neq 0$, and the numerator of the fraction vanishes by (19), the
denominator must vanish, and we have


= . 


(20) 


From (19) and (20), we obtain

$$\psi = \alpha + \beta\sin(\varphi + \gamma), \qquad \chi = \beta\cos(\varphi + \gamma), \tag{21}$$

where $\alpha$, $\beta \neq 0$, $\gamma$ are functions of $u$ and $v$ only.


dy | 

= d| du 9 
du| dd 

du J 4 
| 

(v+*) 0 | 
aw buh 




The equation $E/P = \lambda$ may be written as


Xwldu + (w— k)bo] — bwlxu + (w — k)xo] 


and hence, by (19), may be put in the form 


1 
—=—+Y, 


Xw(ou + wo») bw(Xu + WX») PwXv — Xwhv 
v Vwh ou» Vwhv 


Since the jacobian of the transformation cannot be zero, it follows by (19)
that the common denominator of $r$ and $s$, and also the numerator of $s$, are
each different from zero. Hence, since $\lambda$ is a function of $k$ only, $r$ and $s \neq 0$
are finite real constants, independent of $k$.

Substituting the values of $\psi$ and $\chi$ into the second of the equations (23),
we obtain


(By rBY v) cos (¢ + 7) (By» + rB») sin + 7) — ra, = 0. 


Since #0, it is seen that 6B, —rBy, =0, By. =0, ra, =0. Thus and 
¥ are functions of u only. 

Substituting the values of $\psi$ and $\chi$ into the first of equations (23) and
remembering that $\beta$ and $\gamma$ are functions of $u$ only, we obtain


$$-\beta_u\cos(\varphi + \gamma) + \beta\gamma_u\sin(\varphi + \gamma) - s\alpha_u = 0.$$

Since $\varphi_w \neq 0$, it is seen that

$$\beta_u = 0, \qquad \beta\gamma_u = 0, \qquad s\alpha_u = 0.$$

Thus $\beta$ and $\gamma$ are constants independent of $k$. Then by (21), $\chi$ is a function of
$\varphi$. Hence there is no contact transformation with the desired property for
which $\lambda$ is not zero, and the proof of our theorem is complete.


IV. ISOGONAL SERIES INTO WHIRL SERIES


THEOREM 4. Under any element transformation, there exists in general a
five-parameter family of isogonal series which are transformed into whirl series.
Any contact transformation which carries every isogonal series into a whirl series
must carry every isogonal series into an isogonal series, that is, it must be a
conformal transformation. Any element transformation which converts every
isogonal series into a whirl series must be the product of a conformal
transformation by a whirl.


By any element transformation, the elements of an isogonal series 


$x$, $y = y(x)$, arc tan $p$ = arc tan $y'(x)$ + arc tan $k$
become the elements
$X = \varphi(x, y, p)$, $Y = \psi(x, y, p)$, $P = \chi(x, y, p)$.


According to (6), this series is a whirl series if and only if 


df dx 
(24) 
dx dx 
where 
de (“y+ dp dy dy 
dx \dx 


2 
dx dx dx? +2(5) 
and $\theta$ = arc tan $\chi$.


(25)


Upon substituting the values of the derivatives of $\varphi$, $\psi$, $\chi$ into (25), we
have

$$f = \frac{A + Bp' + Cp'^2 + Dp'^3 + Ep''}{K + Lp' + Mp'^2 + Np'^3 + Rp''}, \tag{26}$$

where A, B, C, D, E, K, L, M, N, R are functions of $x$, $y$, $p$, $k$ only.

Since, by (26), (24) is a differential equation of the fourth order in y, the 
complete solution of (24) contains four constants of integration and the con- 
stant k of our isogonal series. Thus there is in general a five-parameter family 
of isogonal series which under the above transformation are carried into whirl 
series. 

Our next problem is to determine all element transformations which convert
any isogonal series into a whirl series. Then (24) is an identity, hence the
coefficient of $p'''$ is zero, whence

$$(K + Lp' + Mp'^2 + Np'^3)E - (A + Bp' + Cp'^2 + Dp'^3)R = 0.$$

Since this equation is also an identity, the ratios A/K, B/L, C/M, D/N, E/R
have a common value $\lambda(x, y, p, k)$. Then obviously $f = \lambda$, and from (24) we
obtain

$$\frac{A}{K} = \frac{B}{L} = \frac{C}{M} = \frac{D}{N} = \frac{E}{R} = \frac{\chi + a}{1 - a\chi}, \tag{27}$$

where $a$ is a function of $k$ only.


Now since a union is a special case of an isogonal series, it follows by 
Theorem 2 that the transformation which we are seeking must be the product 
of a contact transformation by a whirl. It remains therefore to consider the 
contact transformations which carry any isogonal series into a whirl series. 

In the first place, let us suppose that our contact transformation is an 
extended point transformation. Then it must carry every isogonal series into 
an isogonal series, and therefore according to Kasner’s result, it must be a 
conformal transformation. 

Next let us suppose that our contact transformation carries every point 
into a line. Then it must be of the form 


(28) Set Phy 
y = fo — g, 
x=f;, 


where f and g are functions of x and y only. Then (28) must carry every 
isogonal series into an equi-tangential series. Thus the elements of the union 


$x$, $y = y(x)$, $p = y'(x)$
corresponding to the isogonal series
$x$, $y = y(x)$, arc tan $p$ = arc tan $y'(x)$ + arc tan $k$
must be carried into the elements of the union
$X = \varphi(x, y, y')$, $Y = \psi(x, y, y')$, $P = \chi(x, y, y')$
corresponding to the equi-tangential series
$X = \varphi(x, y, p)$, $Y = \psi(x, y, p)$, $P = \chi(x, y, p)$.


If this series is an equi-tangential series, then 
d 
([o(x, p) — o(x, + ») — W(x, y’)]*) = 0. 


Upon simplifying this equation by means of (28) and setting the coefficient of
$p'$ equal to zero, we obtain (since $f_x g_y - f_y g_x \neq 0$)


(29) [(1 + pk)fe+ (p — = + + 


Equation (29) is an identity; hence the coefficient of $k^2$ is zero, whence


(pfs fy)? = (fz + pfy)?. 




Since this equation is also an identity, we must have 
fe=tfhy, 


that is, f is a constant. By (28), this is impossible. We have, therefore, proved 
that there is no contact transformation which carries every point into a line 
and every isogonal series into a whirl series. 

Now we shall prove that there is no contact transformation with the desired
property for which $\varphi_p \neq 0$, $\psi_p \neq 0$, and $\chi_p \neq 0$. For our contact
transformation we must have


(30) x= 


The equation $E/R = (\chi + a)/(1 - a\chi)$ can be written in the form
+ kp)xz + (pb — k)xv] — xal(l + + (p — bby] 1 — ax 


which, when solved for $a$ and combined with (30), gives


(31) = +s, 
where 

(1 + x?) [oo(x2 + Pxy) + poy) | 
(32) xel— (py: = Vy) + x(poz | 


+ — xu) — (bbs — bv) + — 
— vy) + x(poz — $y) | 

Since the jacobian of the transformation can not be zero, it follows by (30)
that the common denominator of $r$ and $s$, and also the numerator of $r$, are
each different from zero. Hence, since $a$ is a function of $k$ only, $r \neq 0$ and $s$ are
finite real constants and are independent of $k$. From (31), it is then seen
that

The equation $D/N = (\chi + a)/(1 - a\chi)$ may be written in the form
— + 2xvox? + (1+ — WpXpp) 

+ boxe + (i+ x?) pp pp) 1 — ax 


Solving this equation for a, we obtain 


(vp (op 
= (1 + 2x7)bp + + (1 + x’) (*) 


(33) Op \xp ap Xp 
xbp + (1 + + (1 + x?) |= (*) + x=(*)| 
OP \xp Op \xp 


Gz Poy op 




Since $a \neq 0$ and the numerator of the fraction vanishes by virtue of (30),
the denominator must vanish, hence we obtain the relation


a 
(34) 2xbp + < (2) +x (2) = 0. 
\Xp Op \Xp 


From (30) and (34), we then obtain 


where $\alpha$, $\beta$, and $\gamma \neq 0$ are functions of $x$ and $y$ only.
In order that (35) be a contact transformation, we must have

$$\beta_x + p\beta_y = \chi(\alpha_x + p\alpha_y) + (\gamma_x + p\gamma_y)(1 + \chi^2)^{1/2}. \tag{36}$$

From (32) and (35), we obtain
(az + pay)(1 + x?) + (v2 + pry)x(1 + 
— By) — x(pas — ay) — — + x?)"? 
(pas — ay) + — By) 
(pBz — By) — x(paz — ay) — (prz — + x?)"? 
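Since $r \neq 0$, it follows from (36) and (37) that

$$\frac{(\alpha_x + p\alpha_y) + \chi(\beta_x + p\beta_y)}{(p\alpha_x - \alpha_y) + \chi(p\beta_x - \beta_y)} = t,$$

where $t$ is a constant independent of $k$. Solving this equation for $\chi$, we obtain

$$\chi = \frac{(\alpha_x + p\alpha_y) - t(p\alpha_x - \alpha_y)}{-(\beta_x + p\beta_y) + t(p\beta_x - \beta_y)}. \tag{38}$$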


It is observed that neither the numerator nor the denominator of the
fraction in (38) can be zero. For, if either vanished, both must vanish and it
would follow that $\alpha$ and $\beta$ are constants. But then $\gamma$ would have to be a
constant, by (36), and if $\alpha$, $\beta$, $\gamma$ were constants, $\varphi$ and $\psi$, as given by (35),
would be functions of $\chi$; and this is impossible.

We shall prove that $\gamma$ cannot be a constant. For otherwise by (36) and
(38) we would have

$$\frac{(\alpha_x + p\alpha_y) - t(p\alpha_x - \alpha_y)}{-(\beta_x + p\beta_y) + t(p\beta_x - \beta_y)} = \frac{\beta_x + p\beta_y}{\alpha_x + p\alpha_y}.$$


¢=at 
(35) 
¥=6 


(37) 


r 


(38) x= 




Upon setting the term independent of $p$ and the coefficient of $p^2$ each equal
to zero, we obtain

$$\alpha_x(\alpha_x + t\alpha_y) = -\beta_x(\beta_x + t\beta_y),$$
$$\alpha_y(\alpha_y - t\alpha_x) = \beta_y(-\beta_y + t\beta_x).$$

Upon adding these equations, we find that $\alpha$ and $\beta$ are constants. This proves
that $\gamma$ cannot be a constant.
By (38) we see that $\chi$ has an expression of the form

$$\chi = \frac{a + bp}{c + dp}, \tag{39}$$

where $a$, $b$, $c$, and $d$ are functions of $x$ and $y$ only. Also we must have

$$ad - bc \neq 0. \tag{40}$$

For otherwise by (35) and (39), $\varphi$, $\psi$, $\chi$ would be independent of $p$.
Since $\gamma$ is not a constant, it follows by (36) and (39) that

$$(1 + \chi^2)^{1/2} = \frac{[(a + bp)^2 + (c + dp)^2]^{1/2}}{c + dp}$$

is a rational function of $p$ with coefficients which are functions of $x$ and $y$
only. Hence

$$(a + bp)^2 + (c + dp)^2$$

must be a perfect square with respect to the letter $p$. But this can only happen
when $ad - bc = 0$. By (40) this is impossible. Therefore we have proved that
there is no contact transformation with the desired property for which $\varphi_p \neq 0$,
$\psi_p \neq 0$, and $\chi_p \neq 0$. This completes the proof of our theorem.
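
The perfect-square step at the end of this proof rests on an elementary identity:
as a quadratic in p, the expression (a + bp)^2 + (c + dp)^2 has discriminant
-4(ad - bc)^2, so it can be a perfect square only when ad - bc = 0. The
following is a minimal sketch checking that identity on a few sample values; the
function names and the sample tuples are illustrative assumptions, not part of
the original paper.

```python
from fractions import Fraction

def quadratic_in_p(a, b, c, d):
    """Coefficients (q2, q1, q0) of (a + b*p)^2 + (c + d*p)^2 in powers of p."""
    return (b * b + d * d, 2 * (a * b + c * d), a * a + c * c)

def discriminant(a, b, c, d):
    """Discriminant q1^2 - 4*q2*q0 of the quadratic above."""
    q2, q1, q0 = quadratic_in_p(a, b, c, d)
    return q1 * q1 - 4 * q2 * q0

def matches_identity(a, b, c, d):
    """True when the discriminant equals -4*(a*d - b*c)^2."""
    return discriminant(a, b, c, d) == -4 * (a * d - b * c) ** 2

# Hypothetical sample values (exact arithmetic, so equality tests are safe).
samples = [(1, 2, 3, 4), (2, 4, 3, 6), (Fraction(1, 2), 3, -5, 7), (0, 1, 1, 0)]
results = [matches_identity(*s) for s in samples]
print(results)
```

The second sample has ad - bc = 0, which is the only case in which the
discriminant vanishes and the quadratic becomes a perfect square.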


V. WHIRL SERIES INTO WHIRL SERIES 


THEOREM 5. Under any element transformation, there exists in general a 
seven-parameter family of whirl series which are transformed into whirl series. 
Any contact transformation which carries every whirl series into a whirl series 
must be the product of a rigid motion by a magnification. Any element transfor- 
mation which converts every whirl series into a whirl series must be the product 
of a rigid motion by a magnification by a whirl. 


By any element transformation, the elements of a whirl series
$u$, $v = v(u)$, $w = w(u)$,
become the elements


$U = \varphi(u, v, w)$, $V = \psi(u, v, w)$, $W = \chi(u, v, w)$.




If this series is a whirl series, then by (5), we must have 


(dp dp dy . dx 


d | du du® dudu® du\du 
(41) 


du| dy (2) 
du \du du du® du du? 


Now (41) obviously contains $w'''$. Since $w$ is a combination of the elements
$v = v(u)$, $v' = v'(u)$ of the fundamental union, it follows that (41) is a
differential equation of the fourth order in $v$. Hence the complete solution of (41)
contains four constants of integration and the three parameters of our whirl 
series. Thus there is in general a seven-parameter family of whirl series which 
are carried into whirl series. The remainder of the theorem follows from 
Theorems 3 and 4. 


THEOREM 6. There is no transformation which carries every isogonal series
into an equi-tangential series, or every equi-tangential series into an isogonal
series.

This is an immediate consequence of Theorems 3 and 4.


COLUMBIA UNIVERSITY,
New York, N. Y. 


AN EXTENSION OF SCHWARZ’S LEMMA* 


BY 
LARS V. AHLFORS 


I. THE FUNDAMENTAL INEQUALITY 


1. To every neighborhood on a Riemann surface there is given a map onto
a region of the complex plane. For any two overlapping neighborhoods the
corresponding maps are directly conformal.† We agree to denote points on
the surface by $\mathfrak{w}$, corresponding values of the local complex parameter by $w$.
We introduce a Riemannian metric of the form

$$ds = \lambda|dw|, \tag{1}$$

where the positive function $\lambda$ is supposed to depend on the particular
parameter chosen, in such a way that $ds$ becomes invariant. The metric is regular if $\lambda$
is of class $C^2$. In this paper we shall, without mentioning it further, allow $\lambda$ to
become zero, although such points are of course singularities of the metric.

It is well known that the Gaussian curvature of the metric (1) is given by

$$K = -\lambda^{-2}\Delta\log\lambda, \tag{2}$$

and that this expression remains invariant under conformal mappings of the
$w$-plane. We are interested in the case of a metric with negative curvature,
bounded away from zero. It is convenient to choose the upper bound of the
curvature equal to $-4$. From (2) it follows that the corresponding $\lambda$ satisfies
the condition

$$\Delta\log\lambda \ge 4\lambda^2. \tag{3}$$

When we set $u = \log\lambda$ this is equivalent to

$$\Delta u \ge 4e^{2u}. \tag{4}$$

The hyperbolic metric of the unit circle $|z| < 1$ is defined by

$$d\sigma = (1 - |z|^2)^{-1}|dz| \tag{5}$$

and has the constant curvature $-4$.
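
The curvature claim for the metric (5) can be checked numerically from formula
(2). The sketch below approximates the Laplacian of log λ, with
λ = (1 - |z|^2)^{-1}, by a five-point finite difference; the step size and the
sample points are arbitrary choices, and the function names are illustrative
assumptions.

```python
import math

def log_lambda(x, y):
    """u = log(lambda) for the hyperbolic metric (5), at z = x + iy."""
    return -math.log(1.0 - (x * x + y * y))

def laplacian(func, x, y, h=1e-4):
    """Five-point finite-difference approximation of the Laplacian of func."""
    return (func(x + h, y) + func(x - h, y) + func(x, y + h) + func(x, y - h)
            - 4.0 * func(x, y)) / (h * h)

def curvature(x, y):
    """Gaussian curvature K = -lambda^(-2) * Laplacian(log lambda), as in (2)."""
    lam = math.exp(log_lambda(x, y))
    return -laplacian(log_lambda, x, y) / (lam * lam)

# Sample points inside the unit circle (chosen arbitrarily).
values = [curvature(0.0, 0.0), curvature(0.3, -0.4), curvature(0.6, 0.5)]
print(values)
```

Each printed value should be close to -4, up to the error of the
finite-difference approximation.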

2. Consider now an analytic function $w = f(z)$ from the circle $|z| < 1$ to a
Riemann surface $W$. The analyticity is expressed by the fact that every local
parameter $w$ is an analytic function of $z$. To a differential element $dz$
corresponds an element $dw$ whose length does not depend on the direction of $dz$.
The corresponding value of $ds = \lambda|dw| = \lambda_z|dz|$ is therefore uniquely
determined, and we have $\lambda_z = \lambda|w'(z)|$. It is also seen that $u = \log\lambda_z$ satisfies
the condition (4) whenever the given metric has a curvature $\le -4$. An
exception has to be made for the possible zeros of $\lambda_z$, corresponding to the zeros
of $\lambda$ and $w'(z)$.

* Presented to the Society, September 8, 1937; received by the editors April 1, 1937.

† For the definition of a Riemann surface see T. Radó, Über den Begriff der Riemannschen Fläche,
Acta Szeged, vol. 2 (1925).


THEOREM A. If the function $w = f(z)$ is analytic in $|z| < 1$, and if the metric
(1) of $W$ has a negative curvature $\le -4$ at every point, then the inequality

$$ds \le d\sigma \tag{6}$$

will hold throughout the circle.


Proof: Choose an arbitrary $R < 1$ and set $v = \log R(R^2 - |z|^2)^{-1}$ for $|z| < R$.
We note that $\Delta v = 4e^{2v}$ and consequently

$$\Delta(u - v) \ge 4(e^{2u} - e^{2v}). \tag{7}$$

Let us denote by $E$ the open point set in $|z| < R$ for which $u > v$. It is
clear that $E$ cannot contain any zeros of $\lambda_z$. Hence (7) is valid and shows that
$u - v$ is subharmonic in $E$. It follows that $u - v$ can have no maximum in $E$
and must approach its least upper bound on a sequence tending to the
boundary of $E$. But $E$ can have no boundary points on $|z| = R$, for $v$ becomes
positively infinite as $z$ tends to that circle, and at interior boundary points
we must have $u - v = 0$, by continuity. A contradiction is thus obtained,
unless $E$ is vacuous. The inequality $u \le v$ consequently subsists for all points
with $|z| < R$, and letting $R$ tend to 1 we find $u \le -\log(1 - |z|^2)$ at all points.
This is equivalent to (6).

If W is the unit circle and ds its hyperbolic metric, Theorem A is simply 
the differential form of Schwarz’s lemma given by Pick.* 
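
As an illustration of this special case (not taken from the paper), the
inequality (6) can be tested numerically for a particular map of the unit
circle into itself. The sketch below uses f(z) = z^2, which is only an assumed
example, and compares the pulled-back density λ(f(z))|f'(z)| with the
hyperbolic density of (5) at randomly chosen points.

```python
import random

def hyperbolic_density(z):
    """Density of the hyperbolic metric (5): 1 / (1 - |z|^2)."""
    return 1.0 / (1.0 - abs(z) ** 2)

def pulled_back_density(f, df, z):
    """lambda_z = lambda(f(z)) * |f'(z)| when W is the unit circle."""
    return hyperbolic_density(f(z)) * abs(df(z))

# Assumed example map of the unit circle into itself, with its derivative.
f = lambda z: z * z
df = lambda z: 2 * z

random.seed(0)
points = []
while len(points) < 200:
    z = complex(random.uniform(-1, 1), random.uniform(-1, 1))
    if abs(z) < 0.99:
        points.append(z)

# Inequality (6) in the form lambda_z <= (1 - |z|^2)^(-1), with a rounding margin.
ok = all(pulled_back_density(f, df, z) <= hyperbolic_density(z) + 1e-12
         for z in points)
print(ok)
```

For this choice of f the inequality reduces to 2|z| ≤ 1 + |z|^2, so the check is
expected to succeed at every sample point.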

3. Several generalizations of the theorem just proved suggest themselves
at once. Since the only thing we need is to prevent the function $u - v$ from
having a maximum in $E$, it is obvious that the assumptions on $\lambda$ can be
considerably weakened, without affecting the validity of the argument. We shall
give below two such generalizations which are found to be particularly useful
for the applications.


THEOREM A1. Let $\lambda$ be continuous and such that at every point, either
(a) the second derivatives of $u = \log\lambda$ are continuous and satisfy (4), or (b) it
is possible to find two opposite directions $n'$, $n''$ for which
$\partial u/\partial n' + \partial u/\partial n'' > 0$.
Then the statement of the previous theorem is still true.


Opposite directions in the $w$-plane correspond to opposite directions in the
$z$-plane. At a maximum of $u - v$ we have $\partial u/\partial n \le \partial v/\partial n$ in any direction,
whenever the directional derivative exists. For opposite directions
$\partial v/\partial n' + \partial v/\partial n'' = 0$; hence $\partial u/\partial n' + \partial u/\partial n'' \le 0$ in case of a maximum.
It follows that no maximum can be attained in points satisfying condition (b).

* An account of all questions related to Schwarz's lemma will be found in R. Nevanlinna,
Eindeutige analytische Funktionen, Springer, 1936, pp. 45-58.

We shall call $ds' = \lambda'|dw|$ a supporting metric of $ds = \lambda|dw|$ at the point
$\mathfrak{w}_0$ if: (1) $\lambda' = \lambda$ at $\mathfrak{w}_0$, (2) $\lambda'$ is defined and $\le \lambda$ in a neighborhood of $\mathfrak{w}_0$.


THEOREM A2. Suppose that $\lambda$ is continuous, and that it is possible to find a
supporting metric, satisfying (4), at every point of $W$. Then the inequality (6)
still holds.


If $u - v > 0$ at $z_0$, then $u' - v$ will also be positive, and consequently
subharmonic, in a neighborhood of $z_0$.* A maximum of $u - v$ will a fortiori be a
maximum of $u' - v$. Hence $u - v$ can have no maximum in $E$.

II. SCHOTTKY'S THEOREM


4. As a first application we prove Schottky's theorem with definite
numerical bounds.


THEOREM B. If $f(z)$ is analytic and different from 0 and 1 in $|z| < 1$, then

$$\log|f(z)| < \frac{1 + \theta}{1 - \theta}\,(7 + \log^+|f(0)|) \tag{8}$$

for $|z| \le \theta < 1$.†


Let $\zeta_1 = \zeta_1(w)$ map the region outside of the segment (0, 1) onto the
exterior of the unit circle, so that $w = \infty$ corresponds to $\zeta_1 = \infty$, $w = 1$ to $\zeta_1 = 1$,
and $w = 0$ to $-1$. We also set $\zeta_2(w) = \zeta_1(w^{-1})$ and $\zeta_3(w) = \zeta_2(1 - w)$. Clearly
these functions define similar maps of the regions outside of the segments
$(1, \infty)$ and $(-\infty, 0)$. Explicitly, $\zeta_1(w)$ is obtained from the equation

$$\zeta_1 + \zeta_1^{-1} = 4w - 2. \tag{9}$$

We introduce the coordinates $\rho_1 = |w|$, $\rho_2 = |w - 1|$ and divide the plane
into regions $\Omega_1$: $\rho_1 \ge 1$, $\rho_2 \ge 1$; $\Omega_2$: $\rho_1 \le 1$, $\rho_1 \le \rho_2$; $\Omega_3$: $\rho_2 \le 1$, $\rho_2 \le \rho_1$.
The metric

$$ds_i = \frac{|d\log\zeta_i|}{2(4 + \log|\zeta_i|)} = \lambda_i|dw| \tag{10}$$

is readily recognized as the hyperbolic metric of a half-plane with the
constant curvature $-4$. Computing the derivatives $\zeta_i'(w)$ we find

$$\begin{aligned}
\lambda_1^{-1} &= 2(\rho_1\rho_2)^{1/2}(4 + \log|\zeta_1|),\\
\lambda_2^{-1} &= 2\rho_1\rho_2^{1/2}(4 + \log|\zeta_2|),\\
\lambda_3^{-1} &= 2\rho_1^{1/2}\rho_2(4 + \log|\zeta_3|).
\end{aligned} \tag{11}$$

* $u'$ corresponds to $\lambda'$ as $u$ to $\lambda$.

† Schottky's original theorem was purely qualitative. Numerical relations have been studied at
great length, notably by Ostrowski (Studien über den Schottky'schen Satz, Basel, 1931, and
Asymptotische Abschätzung des absoluten Betrags einer Funktion, die die Werte 0 und 1 nicht annimmt,
Commentarii Mathematici Helvetici, vol. 5 (1933)), but no simple inequality comparable with (8) has
ever been proved.

Added in proof: Numerical bounds of the same order of magnitude are found by A. Pfluger,
Über numerische Schranken im Schottky'schen Satz, Commentarii Mathematici Helvetici, vol. 7
(1935). His proof depends on the use of modular functions, while ours is strictly elementary.

We now set $ds = \lambda|dw|$ with $\lambda = \lambda_i$ in $\Omega_i$. This metric is regular and
satisfies condition (3) except at the singular points 0, 1, $\infty$ and on the lines
separating the regions $\Omega_i$. On these lines $\lambda$ is still continuous, as seen from (11)
and the relations between $\zeta_1$, $\zeta_2$, and $\zeta_3$.
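
The continuity of λ across the separating lines can be illustrated
numerically. The sketch below solves (9) for ζ₁, choosing the root with
|ζ₁| ≥ 1, builds ζ₂ and ζ₃ from it, and evaluates the three densities of (11)
at one sample point on each separating line. The function names and the sample
points are assumptions made for illustration only.

```python
import cmath
import math

def zeta1(w):
    """Root of zeta + 1/zeta = 4w - 2 with |zeta| >= 1 (branch chosen as in the text)."""
    s = 4 * w - 2
    root = cmath.sqrt(s * s - 4)
    z = (s + root) / 2
    return z if abs(z) >= 1 else (s - root) / 2

def zeta2(w):
    return zeta1(1 / w)

def zeta3(w):
    return zeta2(1 - w)

def lam(i, w):
    """Density lambda_i of (11) at the point w, with rho_1 = |w|, rho_2 = |w - 1|."""
    r1, r2 = abs(w), abs(w - 1)
    if i == 1:
        return 1 / (2 * math.sqrt(r1 * r2) * (4 + math.log(abs(zeta1(w)))))
    if i == 2:
        return 1 / (2 * r1 * math.sqrt(r2) * (4 + math.log(abs(zeta2(w)))))
    return 1 / (2 * math.sqrt(r1) * r2 * (4 + math.log(abs(zeta3(w)))))

w_a = cmath.exp(2.0j)        # on rho_1 = 1 with rho_2 > 1 (between regions 1 and 2)
w_b = 1 + cmath.exp(0.5j)    # on rho_2 = 1 with rho_1 > 1 (between regions 1 and 3)
w_c = complex(0.5, 0.4)      # on rho_1 = rho_2 <= 1   (between regions 2 and 3)

gaps = [abs(lam(1, w_a) - lam(2, w_a)),
        abs(lam(1, w_b) - lam(3, w_b)),
        abs(lam(2, w_c) - lam(3, w_c))]
print(gaps)
```

The three differences should vanish up to rounding, in agreement with the
statement that λ is continuous on the lines separating the regions.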

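Next we wish to show that condition (b) in Theorem A1 holds on the
singular lines. We consider the arc $\rho_1 = 1$, $\rho_2 > 1$ and choose $n'$, $n''$ as the outer
and inner normals of the circle. The required condition is

$$\frac{\partial\log\lambda_1}{\partial n'} + \frac{\partial\log\lambda_2}{\partial n''} > 0.$$

From (11) we obtain

$$\frac{\partial\log\lambda_1}{\partial n'} + \frac{\partial\log\lambda_2}{\partial n''} = \frac{1}{2} - \frac{2}{4 + \log|\zeta_1|}\,\frac{\partial\log|\zeta_1|}{\partial\rho_1},$$

which is also equal to

$$\frac{1}{2} - \frac{2}{4 + \log|\zeta_1|}\,\frac{\partial\Phi_1}{\partial\varphi},$$

where $\Phi_1 = \arg\zeta_1$, $\varphi = \arg w$. For $\Phi_1$ we have the simple relation
$\cos\Phi_1 = \rho_1 - \rho_2$, which for $\rho_1 = 1$ becomes $\cos\Phi_1 = 1 - 2\sin(\varphi/2)$.
Differentiating we find

$$\frac{\partial\Phi_1}{\partial\varphi} = \frac{1}{2}\left(1 + \csc\frac{\varphi}{2}\right)^{1/2},$$

and by use of the inequalities $\pi/3 < \varphi < 5\pi/3$, $|\zeta_1| > 1$, we are finally led to
the desired result,

$$\frac{\partial\log\lambda_1}{\partial n'} + \frac{\partial\log\lambda_2}{\partial n''} > \frac{1}{2} - \frac{3^{1/2}}{4} > 0.$$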
By symmetry, the same must be true for the arc $\rho_2 = 1$, $\rho_1 > 1$. The
transformation $w' = (1 - w)^{-1}$ takes $\Omega_1$ into $\Omega_2$ and $\Omega_2$ into $\Omega_3$. Since the function $\lambda$ is
invariant under the transformation we conclude at once that condition (b)
will hold also on the line separating $\Omega_2$ and $\Omega_3$.

From Theorem A1 we can now conclude that $w = f(z)$ satisfies the
differential inequality $\lambda|dw| \le (1 - |z|^2)^{-1}|dz|$. Integrating, we find that the
shortest distance between the points $f(0)$ and $f(z)$, $|z| = \theta$, measured in the metric
$ds = \lambda|dw|$, cannot exceed $[\log\,(1 + \theta)/(1 - \theta)]/2$.

The shortest path between the circles $\rho_1 = m$ and $\rho_1 = M$, where $M > m \ge 2$,
is a segment of the negative real axis, whose length is found to be

$$\frac{1}{2}\log\frac{4 + \log|\zeta_1(-M)|}{4 + \log|\zeta_1(-m)|}.$$

To simplify we introduce the lower and upper bounds $|\zeta_1(-M)| \ge 4M$,
$|\zeta_1(-m)| \le 5m$. Setting $M = |f(z)|$ and $m$ equal to the greater of the numbers
$|f(0)|$ and 2 we obtain

$$4 + \log 4M \le \frac{1 + \theta}{1 - \theta}\,(4 + \log 5m).$$

Here $\log 5m \le \log 10 + \log^+|f(0)| < 3 + \log^+|f(0)|$ and we find

$$4 + \log 4M < \frac{1 + \theta}{1 - \theta}\,(7 + \log^+|f(0)|),$$

which is stronger than (8).
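
The two auxiliary bounds on ζ₁ used above follow from (9) on the negative real
axis, where |ζ₁(-t)| is the larger root of x + 1/x = 4t + 2. The sketch below
checks them for a few radii t ≥ 2 and evaluates the right member of (8) for one
assumed pair of values; the names, radii and sample values are illustrative
assumptions.

```python
import math

def abs_zeta1_on_negative_axis(t):
    """|zeta_1(-t)| for t > 0: the root of x + 1/x = 4t + 2 with x > 1."""
    s = 4 * t + 2
    return (s + math.sqrt(s * s - 4)) / 2

def schottky_bound(theta, abs_f0):
    """Right member of (8): (1 + theta)/(1 - theta) * (7 + log^+ |f(0)|)."""
    log_plus = max(0.0, math.log(abs_f0))
    return (1 + theta) / (1 - theta) * (7 + log_plus)

# Check 4t <= |zeta_1(-t)| <= 5t for a few radii t >= 2.
radii = [2, 3, 10, 1000]
bounds_ok = all(4 * t <= abs_zeta1_on_negative_axis(t) <= 5 * t for t in radii)

# Assumed sample values: theta = 1/2 and |f(0)| = 1.
example = schottky_bound(0.5, 1.0)
print(bounds_ok, example)
```

The upper bound 5m is tight only near m = 2, which is why the text takes m to be
at least 2.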
III. BLOCH'S THEOREM


5. Let $w = f(z)$ be analytic in $|z| < 1$ with $|f'(0)| = 1$. Let $B' = B'(f)$ be
the l.u.b. of the radii of all simple (schlicht) circles contained in the Riemann
surface $W$ generated by $f(z)$. Bloch's theorem is $B = \min B' > 0$. Landau has
proved $B > .396$.* Grunsky and Ahlfors proved in a recent paper $B < .472$.†

We show that the method developed in this paper gives an immediate
proof of Bloch's theorem with a better lower bound for $B$. For an arbitrary
point $\mathfrak{w}$ on $W$ let $\rho(\mathfrak{w})$ denote the radius of the largest simple circle of center $\mathfrak{w}$
contained in $W$. It is clear that $\rho(\mathfrak{w})$ is continuous, and equal to zero only
at the branch-points. We introduce the metric $ds = \lambda|dw|$ with

$$\lambda = \frac{A}{2\rho^{1/2}(A^2 - \rho)} \qquad (\rho = \rho(\mathfrak{w})) \tag{12}$$

and $w$ denoting the variable of the function plane (not the uniformizing
variable). $A$ is a constant satisfying the preliminary condition $A^2 > B'$.

In the neighborhood of a branch-point $a$ we have $\rho = |w - a|$. Let $n$ be
the multiplicity of $a$; then $w_1 = (w - a)^{1/n}$ is a uniformizing variable, and
the corresponding $\lambda_1$ is determined from $\lambda_1|dw_1| = \lambda|dw|$. We obtain
$\lambda_1 = An\rho^{1/2 - 1/n}/2(A^2 - \rho)$, and it is seen at once that the metric is regular
in case $n = 2$ and that $\lambda_1$ becomes zero in case $n > 2$.

* E. Landau, Über den Blochschen Satz und zwei verwandte Weltkonstanten, Mathematische
Zeitschrift, vol. 30 (1929).

† L. V. Ahlfors and H. Grunsky, Über die Blochsche Konstante, Mathematische Zeitschrift, vol. 42
(1937). The result was found independently by R. M. Robinson.

We wish to apply Theorem A2 and therefore look for a supporting metric
satisfying the requirements of that theorem. For a regular point $\mathfrak{w}_0$ the
surrounding circle of radius $\rho(\mathfrak{w}_0)$ must pass through at least one singularity $b$
which is either a branch-point or a boundary point for the surface. We set
$\rho' = |w - b|$ and define $\lambda' = A/[2\rho'^{1/2}(A^2 - \rho')]$. This metric has the curvature
$-4$ for it is obtained from the hyperbolic metric of a circle by means of the
transformation $w' = w^{1/2}$. In all points of our circle we have $\rho \le \rho'$ by the
definition of $\rho$. The inequality $\lambda' \le \lambda$ is therefore satisfied in a neighborhood of $\mathfrak{w}_0$
if the function $t^{1/2}(A^2 - t)$ increases for $t \le \rho(\mathfrak{w})$. Under this condition $\lambda'$ will
be a supporting function of $\lambda$, for at the center $\mathfrak{w}_0$ we have $\lambda' = \lambda$. The function
$t^{1/2}(A^2 - t)$ is increasing as long as $t < A^2/3$. Consequently all the conditions
in Theorem A2 are fulfilled if we suppose that $A^2 > 3B'$.

Apply the theorem with $z = 0$. Using the condition $|dw/dz|_{z=0} = 1$ we get

$$A \le 2\rho_0^{1/2}(A^2 - \rho_0), \tag{13}$$

where $\rho_0$ is the radius of the largest simple circle with center at the image of
$z = 0$. The function in the right member of (13) is increasing, and we can
replace $\rho_0$ by $B'$ obtaining $A \le 2B'^{1/2}(A^2 - B')$. Letting $A$ tend to $(3B')^{1/2}$ we
finally get $B' \ge 3^{1/2}/4$. This implies that Bloch's constant $B \ge 3^{1/2}/4 > .433$.
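
The arithmetic of this limiting step can be spelled out numerically. The sketch
below checks on a sample grid that t^{1/2}(A^2 - t) increases for t < A^2/3, and
that the inequality A ≤ 2B'^{1/2}(A^2 - B') with A = (3B')^{1/2} holds just above
and fails just below B' = 3^{1/2}/4. Function names, the grid size and the test
values 0.43 and 0.44 are illustrative assumptions.

```python
import math

def g(t, A):
    """The function t^(1/2) * (A^2 - t) appearing in the right member of (13)."""
    return math.sqrt(t) * (A * A - t)

def increasing_up_to_one_third(A, steps=999):
    """Sample g on 0 < t < A^2/3 and confirm that the samples strictly increase."""
    ts = [A * A / 3 * k / (steps + 1) for k in range(1, steps + 1)]
    vals = [g(t, A) for t in ts]
    return all(x < y for x, y in zip(vals, vals[1:]))

def limiting_inequality_holds(B):
    """A <= 2 * B^(1/2) * (A^2 - B) in the limiting case A = (3B)^(1/2)."""
    A = math.sqrt(3 * B)
    return A <= 2 * math.sqrt(B) * (A * A - B)

threshold = math.sqrt(3) / 4
print(increasing_up_to_one_third(1.0), threshold)
print(limiting_inequality_holds(0.44), limiting_inequality_holds(0.43))
```

With A = (3B')^{1/2} the inequality becomes 3^{1/2} ≤ 4B', which is the source of
the bound 3^{1/2}/4.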

On the other side, if we insert $A = (3B')^{1/2}$ in (13), lower and upper bounds
for $\rho_0$ in terms of $B'$ can be found.

6. Landau has considered a closely related constant $L$. Let $L' = L'(f)$ be
the l.u.b. of the radii of all circles in the $w$-plane contained in the projection
of $W$, that is, whose values are taken by the function $w = f(z)$, $|f'(0)| = 1$. $L$
is defined as the minimum of all such $L'$. Clearly, $L \ge B$.

The method employed above is immediately applicable if we choose
$\lambda = (2\rho\log C/\rho)^{-1}$. This metric is regular at all branch-points, and when we
replace $\rho$ by the distance $\rho'$ from a fixed boundary point, the curvature
becomes $-4$. In order that the function $\lambda'$ thus obtained be a supporting
function it is sufficient that $t\log C/t$ is increasing. This is true for $t < Ce^{-1}$.
We therefore choose $C > eL'$, obtaining the inequality $1 \le 2L'\log C/L'$
as above. Letting $C$ tend to $eL'$ we find $L' \ge 1/2$ and hence $L \ge 1/2$.
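
The same kind of check applies to Landau's constant. The sketch below confirms
on a sample grid that t log(C/t) increases for t < C/e, and that the inequality
1 ≤ 2L' log(C/L') in the limiting case C = eL' holds just above and fails just
below L' = 1/2. The function names, grid size and test values are illustrative
assumptions.

```python
import math

def h(t, C):
    """The function t * log(C / t) used for the supporting metric."""
    return t * math.log(C / t)

def increasing_below_C_over_e(C, steps=999):
    """Sample h on 0 < t < C/e and confirm that the samples strictly increase."""
    ts = [C / math.e * k / (steps + 1) for k in range(1, steps + 1)]
    vals = [h(t, C) for t in ts]
    return all(x < y for x, y in zip(vals, vals[1:]))

def landau_limit_holds(L):
    """1 <= 2 * L * log(C / L) in the limiting case C = e * L."""
    C = math.e * L
    return 1 <= 2 * L * math.log(C / L)

print(increasing_below_C_over_e(1.0))
print(landau_limit_holds(0.51), landau_limit_holds(0.49))
```

In the limit log(C/L') tends to 1, so the inequality reads 1 ≤ 2L'.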

This lower bound is the best known. It shows in particular that $L > B$.*

HARVARD UNIVERSITY,
CAMBRIDGE, MASS.

* In the other direction R. M. Robinson has proved $L < .544$. This result has not been
published.


NORMALITY AND ABNORMALITY IN THE CALCULUS 
OF VARIATIONS* 


BY 
G. A. BLISS 


Within the past few years a number of papers concerning the problem of 
Bolza in the calculus of variations have been published which make it possible 
to carry through the theory of this problem with much simplified assumptions 
concerning what is called the normality of the minimizing arc. I refer
especially to papers by Graves [8],† Hestenes [11, 14, 16], Reid [15], and
Morse [13]. These papers and others are also important because they bring 
the theory of problems of the calculus of variations with variable end points 
to a stage comparable with that already attained for the more special case 
in which the end points are fixed. 

In the theories of Bolza [1, chap. 11, 12] and Bliss [2] for the problem 
of Lagrange with fixed end points it was assumed that the minimizing arc 
considered, extended slightly at both ends, was normal on every sub-interval. 
Morse [4] showed that the theory could be carried through on the assumption 
that the arc itself, without extensions, was normal on every sub-interval. The 
most important case, however, turns out to be the one for which the arc as a 
whole is normal relative to the problem considered, but not necessarily nor- 
mal on sub-intervals. Graves proved the necessary condition of Weierstrass 
for such a normal minimizing arc, and Hestenes deduced further necessary 
conditions and gave sufficiency proofs for a minimum. The importance of 
these results is emphasized by the fact that for the very general problem of 
Mayer, which may be regarded as a sub-case of the problem of Bolza, every 
minimizing arc is abnormal on every sub-interval, even though the arc as a 
whole is normal relative to the problem. Thus the problem of Mayer needs a 
separate treatment, such as was given by Bliss and Hestenes [9, 10], unless 
one has at his command results equivalent to the recent extensions of the 
theory of the problem of Bolza mentioned above. 

In this paper I am attempting to analyze, more explicitly than has been 
done before, the meaning of normality and abnormality for the calculus of 
variations. To do this I have emphasized in §1 below the meaning of normal- 
ity for the problem of a relative minimum of a function of a finite number 
of variables. In §2 analogous notions are discussed for problems of the
calculus of variations. From this discussion it will be clear that a normal arc
for the problem of Bolza is a non-singular arc of the class in which a minimizing
one is sought. The singular arcs of the class are the abnormal ones. They
have an enormous variety of types. It is not likely that a general theory can
be formulated which would apply to all of them, though one might characterize
and study successfully some very general cases.

* Presented to the Society, April 20, 1935; received by the editors April 1, 1937.
† The numbers in brackets here and elsewhere refer to the bibliography at the end of this paper.

In the papers of Graves and Hestenes mentioned above there is no explicit 
assumption concerning normality. The arc studied is assumed only to have a 
set of multipliers like those which it would have if it were normal for the 
problem of Bolza considered. In the following pages it will be seen that, 
though such an arc may be abnormal for the problem originally considered, 
it is nevertheless normal for a second problem of Bolza obtained from the 
first by suitably extending the class of arcs in which a minimizing one is 
sought. Furthermore the properties characterizing a minimizing arc for the 
original problem are effective for the second, so that the sufficiency theorems 
of Hestenes for arcs which are normal have as easy consequences those for 
the abnormal arcs permitted by his hypotheses. This makes possible a num- 
ber of simplifications in the details of the proofs. It is not to be expected, of 
course, that new necessary conditions on a minimizing arc can be secured by 
extending the class of arcs in which a minimizing one is sought. The paper of 
Graves, therefore, seems to contain results not attainable by considering only 
normal arcs. 

In the introduction to his paper [13] Morse makes a statement concerning 
priority for the proofs of sufficiency theorems without assumptions of normal- 
ity which might easily be misunderstood and about which I should like to 
make the following comments. Hestenes had previously proved, in his paper 
[11], three sufficiency theorems (Theorems 9:1, 9:3, 9:5) without explicit 
assumptions of normality, and also a fourth theorem (Theorem 9:4) with 
normality assumptions still undesirably strong, but weaker than those which 
had before been used. Reid [15] and Morse [13] showed independently that 
by means of a further lemma, but aided still essentially by the results of 
Hestenes, this fourth theorem can be brought to a par with the others. The 
condition VI’ [11, p. 811] of Theorem 9:4 is analogous to one which I used 
in the paper [5], and which was originally due to A. Mayer. Its statement
involves the notion of conjugate points and is therefore more closely related
to the classical conditions of Jacobi for simpler problems than the corre- 
sponding conditions of the other theorems. I think it should be understood 
that the priority comment of Morse is applicable to Theorem 9:4 of Hestenes, 
but not to the other three theorems of his paper, which are equally important. 
I may add that the theorems of Hestenes were proved with great originality
and ingenuity while he was my research assistant at the University of Chicago
in 1933 [16, p. 543]. When he went away he left a manuscript with me in 
which the theorems were, at my suggestion, deduced only for normal arcs, 
the ones which then, as well as now, seemed to me the most important, even 
though the justification of the arguments of the present paper was at that 
time missing. This manuscript has since appeared in much modified form in 
my mimeographed lectures on the problem of Bolza [12]. In his paper [11] 
Hestenes showed that his methods are also effective for the problem of Bolza 
in the form adopted by Morse. 

1. Abnormality for minima of functions of a finite number of variables.
The significance of the notion of abnormality in the calculus of variations
can be indicated by a study of the theory of the simpler problem of finding,
in the set of points $y=(y_1,\dots,y_n)$ satisfying a system of equations of the
form

$$\phi_\beta(y) = 0 \qquad (\beta = 1,\dots,m<n),$$

one which minimizes a function $f(y)$. For a point $y^0=(y_1^0,\dots,y_n^0)$ near
which the functions $f$ and $\phi_\beta$ have continuous partial derivatives of at least
the second order, and which satisfies the equations $\phi_\beta=0$, we have the
following theorems, some of which are, of course, well known.


THEOREM 1:1. A first necessary condition for $f(y^0)$ to be a minimum is that
there exist constants $l_0$, $l_\beta$ not all zero such that the derivatives $F_{y_i}$ of the function

$$F = l_0 f + l_\beta\phi_\beta$$

all vanish at $y^0$.
To prove this we have only to note that the determinants of order $m+1$ of the matrix

$$\begin{Vmatrix} f_{y_i}(y^0) \\ \phi_{\beta y_i}(y^0) \end{Vmatrix}$$

must all vanish. Otherwise, according to well known implicit function theorems,
the equations $f(y)=f(y^0)+u$, $\phi_\beta(y)=0$ would have solutions $y$ for negative
values of $u$, and $f(y^0)$ could not be a minimum.

A point $y^0$ has by definition order of abnormality equal to $q$ if there exist $q$
linearly independent sets of multipliers of the form $l_0=0$, $l_\beta$ having the
property of the theorem. When $q=0$ the point $y^0$ is said to be normal. A necessary
and sufficient condition for abnormality of order $q$ is evidently that the matrix
$\|\phi_{\beta y_i}(y^0)\|$ have rank $m-q$. At a normal point $y^0$ the multipliers $l_0$, $l_\beta$ of the
theorem can be divided by $l_0$ and put into the form $l_0=1$, $l_\beta$. In this form they
are unique, since the non-vanishing difference of two such sets would be a set
of multipliers implying abnormality.
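
The rank criterion can be illustrated numerically. The following is a minimal sketch, not part of the original argument: it assumes the gradients of the constraint functions at $y^0$ are supplied as rows of rational numbers, and the helper names `matrix_rank` and `order_of_abnormality` are hypothetical.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a small matrix by exact Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Look for a pivot in this column among the rows not yet used.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank


def order_of_abnormality(constraint_gradients):
    """q = m - rank, where row beta holds the derivatives of phi_beta at y0."""
    return len(constraint_gradients) - matrix_rank(constraint_gradients)


# Hypothetical constraints phi_1 = y1 + y2, phi_2 = 2*y1 + 2*y2 + y3**2 at the
# origin: the gradients (1, 1, 0) and (2, 2, 0) are dependent, so q = 1.
q_abnormal = order_of_abnormality([[1, 1, 0], [2, 2, 0]])

# Hypothetical constraints phi_1 = y1 + y2, phi_2 = y3: independent gradients, q = 0.
q_normal = order_of_abnormality([[1, 1, 0], [0, 0, 1]])
```

A point is normal exactly when this function returns 0, that is, when the gradient matrix has full rank $m$.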


368 G. A. BLISS [May 


LEMMA 1:1. If a point $y^0$ is normal, then for every set of constants $\eta_i$
$(i=1,\dots,n)$ satisfying the equations

$$\phi_{\beta y_i}(y^0)\,\eta_i = 0 \tag{1:1}$$

there exists a set of functions $y_i(b)$ having continuous second derivatives near
$b=0$, satisfying the equations $\phi_\beta=0$, and such that

$$y_i(0) = y_i^0, \qquad y_i'(0) = \eta_i.$$

The proof can be made by considering the equations

$$\phi_\beta(y) = 0, \qquad \phi_r(y) = \phi_r(y^0) + b\,\zeta_r \qquad (\beta=1,\dots,m;\ r=m+1,\dots,n) \tag{1:2}$$

in which the auxiliary functions $\phi_r(y)$ are selected so that they have continuous
second derivatives near $y^0$ and make the functional determinant
$|\phi_{iy_k}(y^0)|$ different from zero, and in which the constants $\zeta_r$ are defined by
the equations

$$\phi_{r y_i}(y^0)\,\eta_i = \zeta_r. \tag{1:3}$$

Equations (1:2) then have solutions $y_i(b)$ with continuous derivatives of at
least the second order near $b=0$, and such that $y_i(0)=y_i^0$. By differentiating
with respect to $b$ the equations (1:2) with these solutions substituted, we find
the equations

$$\phi_{\beta y_i}(y^0)\,y_i'(0) = 0, \qquad \phi_{r y_i}(y^0)\,y_i'(0) = \zeta_r.$$

With equations (1:1) and (1:3) these show that $y_i'(0)=\eta_i$.
THEOREM 1:2. If $y^0$ is a normal point and $f(y^0)$ a minimum then the condition

$$F_{y_iy_k}(y^0)\,\eta_i\eta_k \ge 0$$

must hold for every set $\eta_i$ satisfying the equations (1:1), where $F=f+l_\beta\phi_\beta$ is the
function formed with the unique set of multipliers $l_0=1$, $l_\beta$ belonging to $y^0$.


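The conclusion of the theorem is due to the fact that the function
$g(b)=f[y(b)]$, formed with the functions $y_i(b)$ of the lemma, must have a
minimum at $b=0$. Since

$$\phi_{\beta y_i}[y(b)]\,y_i'(b) = 0$$

the derivatives of $g(b)$ are seen to have the values

$$g'(b) = f_{y_i}\,y_i'(b) = F_{y_i}\,y_i'(b),$$
$$g''(b) = F_{y_iy_k}\,y_i'(b)\,y_k'(b) + F_{y_i}\,y_i''(b),$$

and for $g(0)$ to be a minimum we must have $g''(0)\ge 0$. Since $F_{y_i}(y^0)=0$ and
$y_i'(0)=\eta_i$, this is the condition of the theorem.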


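THEOREM 1:3. If a point $y^0$ has a set of multipliers $l_0=1$, $l_\beta$ for which the
function $F=f+l_\beta\phi_\beta$ satisfies the conditions

$$F_{y_i}(y^0) = 0, \qquad F_{y_iy_k}(y^0)\,\eta_i\eta_k > 0 \tag{1:4}$$

for all non-vanishing sets $\eta_i$ satisfying the equations

$$\phi_{\beta y_i}(y^0)\,\eta_i = 0, \tag{1:5}$$

then $f(y^0)$ is a minimum.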


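This can be proved with the help of Taylor's formula with integral form
of remainder. For every point $y$ near $y^0$ satisfying the equations $\phi_\beta=0$ we
have the equations

$$f(y)-f(y^0) = f_{y_i}(y^0)\,\eta_i + \int_0^1 (1-\theta)\,f_{y_iy_k}(y')\,\eta_i\eta_k\,d\theta,$$

$$0 = \phi_{\beta y_i}(y^0)\,\eta_i + \int_0^1 (1-\theta)\,\phi_{\beta y_iy_k}(y')\,\eta_i\eta_k\,d\theta,$$

$$0 = \int_0^1 \phi_{\beta y_i}(y')\,\eta_i\,d\theta, \tag{1:6}$$

where $y_i' = y_i^0+\theta(y_i-y_i^0)$, $\eta_i=y_i-y_i^0$. From these we find readily

$$f(y)-f(y^0) = \int_0^1 (1-\theta)\,F_{y_iy_k}(y')\,\eta_i\eta_k\,d\theta.$$

Since the quadratic form in the integrand of the last integral, thought of
as a function of independent variables $y'$ and $\eta$, is positive for $y'=y^0$
and all non-vanishing sets $\eta$ satisfying the equations (1:5), it will remain positive for
$y'=y^0+\theta(y-y^0)$ and all non-vanishing sets $\eta$, including $\eta_i=y_i-y_i^0$, satisfying equations
(1:6), provided that $y$ lies in a sufficiently small neighborhood $N$ of the
point $y^0$. Thus we see that for all points $y\ne y^0$ in $N$ satisfying the equations
$\phi_\beta=0$ the difference $f(y)-f(y^0)$ is positive.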

The last theorem is analogous to the sufficiency theorems of Hestenes in
the calculus of variations. In it there is no explicit assumption concerning
the normality or abnormality of the point $y^0$. If $y^0$ has abnormality of order $q$,
however, let $\nu$ be a variable which ranges over a subset of $m-q$ of the numbers
$1,\dots,m$ such that the matrix $\|\phi_{\nu y_i}(y^0)\|$ has rank $m-q$, and let $\rho$ range
over the complementary subset. Then we have the following theorem:




THEOREM 1:4. Let $y^0$ be a point which satisfies the hypotheses of Theorem
1:3 with a set of multipliers $l_0=1$, $l_\beta$, and let $\nu$ and $\rho$ be variables having the
ranges described in the last paragraph. Then $y^0$ is normal for the modified problem
of minimizing the function $g=f+l_\rho\phi_\rho$ in the class of points $y$ satisfying the
restricted system of equations $\phi_\nu=0$, and $y^0$ satisfies the hypotheses of Theorem
1:3 for the modified problem with the multipliers $l_0=1$, $l_\nu$. Furthermore if $g(y^0)$
is a minimum for the modified problem, then $f(y^0)$ is a minimum for the original
one.


We see that the point $y^0$ is normal for the modified problem, since the
matrix $\|\phi_{\nu y_i}(y^0)\|$ has rank $m-q$. For the function $F=g+l_\nu\phi_\nu=f+l_\beta\phi_\beta$ of the
modified problem the conditions (1:4) are satisfied for all sets $\eta$ satisfying the
equations

$$\phi_{\nu y_i}(y^0)\,\eta_i = 0, \tag{1:7}$$

since equations (1:5) are linear and have a matrix of coefficients of rank
$m-q$ and hence are consequences of equations (1:7). The set of points $y$
satisfying the equations $\phi_\nu=0$ includes the points satisfying the complete
system $\phi_\beta=0$ as a subclass in which $g=f$. Hence if $g(y^0)$ is a minimum for the
modified problem, the value $f(y^0)=g(y^0)$ must have the same property for
the original problem.

From the last theorem it is evident that generality is not lost by proving
Theorem 1:3 only for points $y^0$ which are normal. Such points are, in fact,
the non-singular points of the class which satisfy the equations $\phi_\beta=0$. Near
each of them there are infinitely many points of the class, as is shown by
Lemma 1:1, and the minimum problem near one of them is therefore never
trivial. Abnormal points, on the other hand, are the singular points of the
class, and may occur in a wide variety of types. For some of these points
the minimum problem is trivial, as, for example, in the case of a point $y^0$,
for which $\phi_1=0$, which minimizes the function $\phi_1$ in the class of points $y$
satisfying the equations $\phi_2=\cdots=\phi_m=0$. Near such a point $y^0$ there is no
other point satisfying all of the equations $\phi_\beta=0$.

An idea of the great variety of types of abnormal points may be gained
by considering the problem of minimizing a function $f(y_1,y_2)$ of two variables
in the class of points $(y_1,y_2)$ satisfying a single equation $\phi(y_1,y_2)=0$. The
variety of abnormal points possible in this case is at least as great as the
variety of singular points of an algebraic curve. The particular example
$f=2y_1^2-y_2^2$, $\phi=y_1^2y_2-y_2^3=0$, with minimizing point $(0,0)$, shows that the
condition involving the quadratic form in Theorem 1:3 is not in general
necessary for a minimum.
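
The example can be checked by direct computation. The sketch below is illustrative only and rests on an assumption: the formulas are badly printed in the available copy, and they are read here as $f=2y_1^2-y_2^2$, $\phi=y_1^2y_2-y_2^3$. With that reading the curve $\phi=0$ splits into the three lines $y_2=0$, $y_2=y_1$, $y_2=-y_1$, and the second derivatives of $F=f+l\phi$ at the origin do not depend on $l$. All function names are hypothetical.

```python
def f(y1, y2):
    return 2 * y1**2 - y2**2


def phi(y1, y2):
    return y1**2 * y2 - y2**3


def points_on_constraint(values):
    """Points of phi = 0, using the factorization phi = y2*(y1 - y2)*(y1 + y2)."""
    for t in values:
        yield (t, 0)
        yield (t, t)
        yield (t, -t)


def hessian_F(y1, y2, l):
    """Second derivatives of F = f + l*phi, differentiated by hand."""
    return [[4 + 2 * l * y2, 2 * l * y1],
            [2 * l * y1, -2 - 6 * l * y2]]


def quadratic_form(h, eta):
    """Value of the quadratic form with coefficient matrix h at the vector eta."""
    return sum(h[i][k] * eta[i] * eta[k] for i in range(2) for k in range(2))
```

On every sampled point of the curve $f\ge 0$ with equality only at the origin, so $(0,0)$ minimizes $f$, while `quadratic_form(hessian_F(0, 0, l), (0, 1))` equals $-2$ for every multiplier $l$. Both derivatives of $\phi$ vanish at the origin, so equations (1:5) place no restriction on $\eta$ there, and the inequality (1:4) fails for $\eta=(0,1)$.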


2. Abnormality in the calculus of variations. The problem to be considered
in this section [12, p. 4] is that of finding in a class of arcs

$$y_i(x) \qquad (i=1,\dots,n;\ x_1\le x\le x_2) \tag{2:1}$$

satisfying conditions of the form

$$\phi_\beta(x,y,y') = 0 \qquad (\beta=1,\dots,m<n),$$
$$\psi_\mu[x_1,y(x_1),x_2,y(x_2)] = 0 \qquad (\mu=1,\dots,p\le 2n+2)$$

one which minimizes a sum

$$J = g[x_1,y(x_1),x_2,y(x_2)] + \int_{x_1}^{x_2} f(x,y,y')\,dx.$$

A set of values $(x,y,y')$ and end values $[x_s,y_{is}]=[x_s,y_i(x_s)]$ $(s=1,2)$ is said
to be admissible if it lies interior to a region of such values in which the functions
$f$, $g$, $\phi_\beta$, $\psi_\mu$ have continuous derivatives of at least the fourth order, and
in which the matrix $\|\phi_{\beta y_i'}\|$ and the matrix of first derivatives of the functions
$\psi_\mu$ have ranks $m$ and $p$, respectively. An admissible arc $C$ defined by functions
of the form (2:1) is one which is continuous and consists of a finite number of
sub-arcs with continuously turning tangents, and whose elements $(x,y,y')$
and end values are admissible. When convenient we may represent by $J(C)$,
$g(C)$, $\psi_\mu(C)$ the values of these functions determined by the arc $C$.

The conditions involved in the sufficiency theorems for this problem are 
the following, the numbering being that which I have often used [see, e.g., 12, 
chap. 3]: 

I. THE MULTIPLIER RULE. A set of multipliers $l_0$, $l_\beta(x)$, $e_\mu$ for an admissible
arc $E$ is a set for which the $l_0$, $e_\mu$ are constants and the functions $l_\beta(x)$, defined
on the interval $x_1x_2$ belonging to $E$, are continuous except possibly at values
of $x$ defining corners of $E$ at which they nevertheless have well-defined forward
and backward limits. The arc $E$ satisfies the multiplier rule if there exist
constants $c_i$ and multipliers $l_0$, $l_\beta(x)$, $e_\mu$ such that for $F=l_0f+l_\beta(x)\phi_\beta$ the
equations

$$F_{y_i'} = \int_{x_1}^{x} F_{y_i}\,dx + c_i, \qquad \phi_\beta = 0$$

are satisfied along $E$, and furthermore such that the end values of $E$ satisfy
the equations

$$\big[(F-y_i'F_{y_i'})\,dx + F_{y_i'}\,dy_i\big]_1^2 + l_0\,dg + e_\mu\,d\psi_\mu = 0, \qquad \psi_\mu = 0 \tag{2:2}$$

identically in the differentials $dx_s$, $dy_{is}$.




It has been proved [12, p. 27] that the identically vanishing set of multipliers
is the only set having constants $l_0$, $e_\mu$ all zero, or having functions
$l_0$, $l_\beta(x)$ which vanish simultaneously at some value $x$ on the interval $x_1x_2$.


II$_N$. An admissible arc $E$ satisfies the strengthened condition of Weierstrass
if for every set of the type $(x,y,y',l)$ in a neighborhood $N$ of those
belonging to $E$ the inequality

$$E(x,y,y',l,Y') > 0$$

is satisfied for all admissible sets $(x,y,Y')\ne(x,y,y')$, where

$$E = F(x,y,Y',l) - F(x,y,y',l) - (Y_i'-y_i')\,F_{y_i'}(x,y,y',l).$$


III'. An admissible arc $E$ satisfies the strengthened condition of Clebsch
if at every element $(x,y,y',l)$ belonging to $E$, the inequality

$$F_{y_i'y_k'}(x,y,y',l)\,\pi_i\pi_k > 0$$

is satisfied for all non-vanishing sets $\pi_i$ satisfying the equations

$$\phi_{\beta y_i'}(x,y,y')\,\pi_i = 0.$$


If we represent by $q$, $q_\mu$ the quadratic forms in $dx_s$, $dy_{is}$ whose coefficients
are the second derivatives of the functions $g$, $\psi_\mu$, respectively, the second
variation of $J$ for an extremal arc $E$ with multipliers $l_0=1$, $l_\beta(x)$, $e_\mu$ has the
value

$$J_2(\xi,\eta) = 2\gamma[\xi_1,\eta(x_1),\xi_2,\eta(x_2)] + \int_{x_1}^{x_2} 2\omega(x,\eta,\eta')\,dx$$
in which 
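$$2\omega = F_{y_iy_k}\,\eta_i\eta_k + 2F_{y_iy_k'}\,\eta_i\eta_k' + F_{y_i'y_k'}\,\eta_i'\eta_k',$$

$$2\gamma = \big[(F_x-y_i'F_{y_i})\,dx^2 + 2F_{y_i}\,dx\,dy_i\big]_1^2 + 2q + 2e_\mu q_\mu,$$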


with $dx_s$, $dy_{is}$ replaced by $\xi_s$, $y_{is}'\xi_s+\eta_i(x_s)$ [12, p. 71]. The equations of variation
along $E$ are the equations

$$\Phi_\beta(x,\eta,\eta') = 0, \qquad \Psi_\mu[\xi_1,\eta(x_1),\xi_2,\eta(x_2)] = 0 \tag{2:3}$$

in which

$$\Phi_\beta = \phi_{\beta y_i}\,\eta_i + \phi_{\beta y_i'}\,\eta_i',$$

and $\Psi_\mu$ is $d\psi_\mu$ with $dx_s$, $dy_{is}$ replaced as above by $\xi_s$, $y_{is}'\xi_s+\eta_i(x_s)$ [12, p. 14]. An
admissible set $\xi_1$, $\xi_2$, $\eta_i(x)$ is one for which $\xi_1$, $\xi_2$ are constants and the functions
$\eta_i(x)$ have on $x_1x_2$ the continuity properties of an admissible arc $y_i(x)$. The
second variation $J_2(\xi,\eta)$ for $E$ is by definition positive definite if it is positive
for all non-vanishing admissible sets $\xi_1$, $\xi_2$, $\eta_i(x)$ satisfying the equations (2:3).


IV’. An extremal arc E satisfies the condition IV’ if its second variation 
is positive definite. 


The condition IV’ is applicable to an admissible arc which has no corners 
and satisfies conditions I and III’, since such an arc is necessarily non-singu- 
lar and an extremal [12, pp. 112, 117]. 

The sufficiency theorem of Hestenes to be considered here is now the fol- 
lowing one: 


THEOREM 2:1. If an admissible arc $E$ has no corners and satisfies the conditions
I, II$_N$, III', IV' with a set of multipliers $l_0=1$, $l_\beta(x)$, $e_\mu$ then $J(E)$ is a
strong relative minimum.


Every admissible arc $E$ satisfies the multiplier rule with none or a limited
number of linearly independent non-vanishing sets of multipliers having $l_0=0$.
It is said to have order of abnormality equal to $q$ if it satisfies I with $q$ and only $q$
such sets $l_{0\sigma}=0$, $l_{\beta\sigma}(x)$, $e_{\mu\sigma}$ $(\sigma=1,\dots,q)$. When $q=0$ it is said to be normal.
A set of non-vanishing multipliers with $l_0=0$ will be called an abnormal set
of multipliers.

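For an admissible arc with order of abnormality equal to $q$ the equation

$$\big[F_{\sigma y_i'}\,\eta_i\big]_1^2 + e_{\mu\sigma}\Psi_\mu = 0 \tag{2:4}$$

with $F_\sigma=l_{\beta\sigma}(x)\phi_\beta$ is for each $\sigma$ an identity in the variables $\xi_s$, $\eta_{is}=\eta_i(x_s)$, since
this is what the first equation (2:2) becomes for the multipliers $l_{0\sigma}=0$, $l_{\beta\sigma}(x)$,
$e_{\mu\sigma}$ when the end values of $dx$, $dy_i$ are replaced by those of $\xi$, $y_i'\xi+\eta_i$. The
usual integration by parts applied to the sum

$$l_{\beta\sigma}\Phi_\beta = F_{\sigma y_i}\,\eta_i + F_{\sigma y_i'}\,\eta_i'$$

gives the equation

$$\int_{x_1}^{x_2} l_{\beta\sigma}\Phi_\beta\,dx = \big[F_{\sigma y_i'}\,\eta_i\big]_1^2 \tag{2:5}$$

so that for every admissible set of variations satisfying the equations $\Phi_\beta=0$
we find with the help of equations (2:4) and (2:5) the relations

$$\big[F_{\sigma y_i'}\,\eta_i\big]_1^2 = 0, \qquad e_{\mu\sigma}\Psi_\mu = 0. \tag{2:6}$$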


The matrix of the $q$ sets of values $e_{\mu\sigma}$ $(\sigma=1,\dots,q)$ is necessarily of rank
$q$. Otherwise there would be a linear combination of these sets vanishing
identically, and, according to a remark made above, the same combination
of the linearly independent complete sets $l_{0\sigma}$, $l_{\beta\sigma}(x)$, $e_{\mu\sigma}$ would then also vanish
identically, which is impossible. In the following paragraphs the variable $\rho$
is understood to have as its range a subset of the numbers $\mu=1,\dots,p$ such
that the determinant $|e_{\rho\sigma}|$ is different from zero, and the variable $\nu$ will have
the range complementary to that of $\rho$. The second equation (2:6) then shows
that for an admissible set $\xi_1$, $\xi_2$, $\eta_i(x)$ the equations $\Psi_\rho=0$ are consequences of
the equations $\Phi_\beta=\Psi_\nu=0$.

THEOREM 2:2. Let $E$ be an admissible arc without corners which satisfies
the hypotheses of Theorem 2:1 with a set of multipliers $l_0=1$, $l_\beta(x)$, $e_\mu$, and let $\rho$
and $\nu$ be variables whose ranges are determined by the linearly independent abnormal
sets of multipliers of $E$ as described in the last paragraph. Then the arc $E$
is normal for the modified problem of minimizing the functional $J(C)+e_\rho\psi_\rho(C)$
in the class of admissible arcs $C$ satisfying the reduced system of equations
$\phi_\beta=\psi_\nu=0$, and the arc $E$ with the multipliers $l_0=1$, $l_\beta(x)$, $e_\nu$ satisfies the hypotheses
of Theorem 2:1 for the modified problem. Furthermore if $J(E)+e_\rho\psi_\rho(E)$
is a strong relative minimum for the modified problem, then $J(E)$ is a similar
minimum for the original problem.


It is easy to see that the arc $E$ is normal for the modified problem. For if $E$
had for that problem a set of non-vanishing multipliers of the form $l_0=0$,
$l_\beta(x)$, $e_\nu$, the set $l_0=0$, $l_\beta(x)$, $e_\rho=0$, $e_\nu$ would be multipliers for $E$ and the original
problem, necessarily linearly expressible in terms of the $q$ sets $l_{0\sigma}=0$, $l_{\beta\sigma}(x)$,
$e_{\mu\sigma}$ $(\sigma=1,\dots,q)$. This is, however, impossible on account of the fact that
the determinant $|e_{\rho\sigma}|$ is not zero.

The arc $E$ satisfies the hypotheses of Theorem 2:1 for the modified problem
with the multipliers $l_0=1$, $l_\beta(x)$, $e_\nu$, as one readily sees by an examination
of the conditions I, II$_N$, III', IV'. For the condition IV' one needs to note
that on account of the second equation (2:6) the restricted system $\Phi_\beta=\Psi_\nu=0$
implies the complete system $\Phi_\beta=\Psi_\mu=0$.

Since the class of arcs in which a minimizing one is sought for the modified
problem includes as a subclass those among which a minimizing arc is sought
for the original problem, and since on the subclass the values of the functionals
$J(C)+e_\rho\psi_\rho(C)$ and $J(C)$ are equal, the last statement of the theorem is
evidently true.

The remarks made at the end of §1 are now applicable for the most part
to the problem of Bolza also. As a result of Theorem 2:2 it is clear that no
generality is lost by proving Theorem 2:1 for normal arcs only, and the proof
for such arcs turns out to be in some respects simpler than for the abnormal
arcs included in the proof of Hestenes. A normal arc is a non-singular arc of
the class in which a minimizing arc is sought in the sense that near every
normal arc there are an infinity of other arcs of the class [12, pp. 49, 51].
The minimum problem near such an arc is therefore never trivial. Near an
abnormal arc $E$, on the other hand, there may be no other arc of the class in
which a minimizing one is sought, as in the case when $\psi_1(E)$ vanishes and is a
strong relative minimum or maximum in the class of admissible arcs satisfying
the conditions $\phi_\beta=\psi_2=\cdots=\psi_p=0$. In this case the minimum problem
near $E$ is trivial. The variety of types of abnormal arcs is evidently very
great. Those included in the sufficiency theorems of Hestenes are of a special
type closely related to normal arcs. Other important special types can doubtless
be described and discussed, and it might be useful to have results of this
kind. But it seems likely that a comprehensive theory would at this time be
exceedingly elaborate and difficult, and perhaps impossible.

When the number of the end conditions $\psi_\mu=0$ is equal to the number
$2n+2$ of end values $x_s$, $y_{is}$ $(s=1,2)$ the problem is said to have fixed end
points. An admissible arc $E$ is by definition normal on a sub-interval $x'x''$ if
its corresponding sub-arc is normal relative to the problem with fixed end
points on that interval. The assumption that an arc $E$ is normal on every
sub-interval is evidently undesirable, for the same reason that it would be
undesirable to assume for the problem of §1 that the determinants of order $m$
of some particular set belonging to the matrix $\|\phi_{\beta y_i}\|$ are all different from
zero. For the problem of Mayer, which is the problem of Bolza with integrand
function $f$ identically zero, every minimizing arc is abnormal on every sub-interval,
as has been pointed out by Carathéodory [6, 7] and others. No
theory based upon the assumption of normality on sub-intervals can therefore
be effective in this important case.


BIBLIOGRAPHY 


1. Bolza, Vorlesungen über Variationsrechnung, 1st edition, 1909; 2d edition, 1933.

2. Bliss, The problem of Lagrange in the calculus of variations, American Journal of Mathe- 
matics, vol. 52 (1930), pp. 673-744. 

3. Morse, Sufficient conditions in the problem of Lagrange with variable end conditions, American 
Journal of Mathematics, vol. 53 (1931), pp. 517-546; see also Morse, The problems of Lagrange and 
Mayer under general end conditions, Proceedings of the National Academy of Sciences, vol. 16 (1930), 
pp. 229-233. 

4. Morse, Sufficient conditions in the problem of Lagrange with fixed end points, Annals of Mathe- 
matics, vol. 32 (1931), pp. 567-577. 

5. Bliss, The problem of Bolza in the calculus of variations, Annals of Mathematics, vol. 33 (1932), 
pp. 261-274. 

6. Carathéodory, Die Theorie der zweiten Variation beim Problem von Lagrange, Sitzungsberichte 
der Bayerischen Akademie der Wissenschaften, 1932, pp. 99-114. 

7. Carathéodory, Über die Einteilung der Lagrangeschen Variationsprobleme nach Klassen, Commentarii
Mathematici Helvetici, vol. 5 (1933), pp. 1-10.


8. Graves, On the Weierstrass condition for the problem of Bolza in the calculus of variations, 
Annals of Mathematics, vol. 33 (1932), pp. 747-752. 

9. Bliss and Hestenes, Sufficient conditions for a problem of Mayer in the calculus of variations, 
these Transactions, vol. 35 (1933), pp. 305-326. 

10. Hestenes, Sufficient conditions for the general problem of Mayer with variable end points, these
Transactions, vol. 35 (1933), pp. 479-490; also Contributions to the Calculus of Variations 1931-1932,
The University of Chicago.

11. Hestenes, Sufficient conditions for the problem of Bolza in the calculus of variations, these 
Transactions, vol. 36 (1934), pp. 793-818. 

12. Bliss, The problem of Bolza, Mimeographed lecture notes, University of Chicago, Winter 
Quarter, 1935. 

13. Morse, Sufficient conditions in the problem of Lagrange without assumptions of normalcy, these 
Transactions, vol. 37 (1935), pp. 147-160. 

14. Hestenes, The problem of Bolza in the calculus of variations in parametric form, American 
Journal of Mathematics, vol. 58 (1936), pp. 391-406. 

15. Reid, The theory of the second variation for the non-parametric problem of Bolza, American 
Journal of Mathematics, vol. 57 (1935), pp. 573-586. 

16. Hestenes, On sufficient conditions in the problems of Lagrange and Bolza, Annals of Mathe- 
matics, vol. 37 (1936), pp. 543-551. 


UNIVERSITY OF CHICAGO,
CHICAGO, ILL.


A THEOREM IN FINITE PROJECTIVE GEOMETRY 
AND SOME APPLICATIONS TO NUMBER 
THEORY* 


BY 
JAMES SINGER 


A point in a finite projective plane $PG(2,p^n)$ may be denoted by the
symbol $(x_1,x_2,x_3)$, where the coordinates $x_1$, $x_2$, $x_3$ are marks of a Galois field
of order $p^n$, $GF(p^n)$. The symbol $(0,0,0)$ is excluded, and if $k$ is a non-zero
mark of the $GF(p^n)$, the symbols $(x_1,x_2,x_3)$ and $(kx_1,kx_2,kx_3)$ are to be
thought of as the same point. The totality of points whose coordinates satisfy
the equation $u_1x_1+u_2x_2+u_3x_3=0$, where $u_1$, $u_2$, $u_3$ are marks of the $GF(p^n)$, not
all zero, is called a line. The plane then consists of $p^{2n}+p^n+1=q$ points and $q$
lines; each line contains $p^n+1$ points.

A finite projective plane, $PG(2,p^n)$, defined in this way is Pascalian and
Desarguesian; it exists for every prime $p$ and positive integer $n$, and there is
only one such $PG(2,p^n)$ for a given $p$ and $n$ (VB, p. 247, VY, p. 151).

Let $A_0$ be a point of a given $PG(2,p^n)$, and let $C$ be a collineation of the
points of the plane. (A collineation is a 1-1 transformation carrying points
into points and lines into lines.) Suppose $C$ carries $A_0$ into $A_1$, $A_1$ into
$A_2$, $\dots$, $A_{k-1}$ into $A_0$; or, denoting the product $C\cdot C$ by $C^2$, $C\cdot C^2$ by $C^3$, etc.,
we have $C(A_0)=A_1$, $C^2(A_0)=A_2$, $\dots$, $C^k(A_0)=A_0$. If $k$ is the smallest positive
integer for which $C^k(A_0)=A_0$, we call $k$ the period of $C$ with respect to the
point $A_0$. If the period of a collineation $C$ with respect to a point $A_0$ is
$q$ $(=p^{2n}+p^n+1)$, then the period of $C$ with respect to any point in the plane
is $q$, and in this case we will call $C$ simply a collineation of period $q$.

We prove in the first theorem that there is always at least one collineation
of period $q$, and from it we derive some results of interest in finite geometry
and number theory.

Let

$$x^3 - a_3x^2 - b_3x - c_3 = 0 \tag{1}$$

be a primitive irreducible cubic belonging to a field $GF(p^n)$ which defines a
$PG(2,p^n)$. A root $\lambda$ of equation (1) can then be used as a generator of the

* Presented to the Society, October 27, 1934, under a different title; received by the editors 
April 22, 1937. 

† These definitions are taken directly from the paper by Veblen and Bussey, Finite projective
geometries, these Transactions, vol. 7 (1906), p. 244, referred to later as VB; and from the textbook by
Veblen and Young, Projective Geometry, vol. 1, pp. 1-25, 201, referred to later as VY.




non-zero elements of a $GF(p^{3n})$ which contains the given field as a subfield.
By means of the equation we can express any power of $\lambda$ in terms of $\lambda^2$,
$\lambda$, and 1 with coefficients in the $GF(p^n)$, that is,

$$\lambda^i = a_i\lambda^2 + b_i\lambda + c_i, \qquad i=0,1,\dots. \tag{2}$$

Conversely, any three marks, $a$, $b$, $c$, not all zero, of the $GF(p^n)$ will uniquely
determine a power of $\lambda$ and therefore a non-zero mark of the $GF(p^{3n})$. We call
$a_i$, $b_i$, $c_i$ the coordinates of $\lambda^i$.

Since $\lambda$ is a generator of the non-zero elements of the $GF(p^{3n})$, the first
$p^{3n}-1$ powers of $\lambda$ are distinct and $\lambda^0=\lambda^{p^{3n}-1}=1$. The powers of $\lambda$ in the
$GF(p^n)$ are

$$\lambda^0,\ \lambda^q,\ \lambda^{2q},\ \dots,\ \lambda^{(p^n-2)q}, \qquad q=p^{2n}+p^n+1. \tag{3}$$

Two non-zero marks, $\lambda^u$ and $\lambda^v$, of the $GF(p^{3n})$ will be called similar if their
ratio is a mark of the $GF(p^n)$, that is, if $u\equiv v \pmod q$. If the coordinates
of a mark $\lambda^u$ are $a_u$, $b_u$, $c_u$, the coordinates of a similar mark will be $ka_u$, $kb_u$,
$kc_u$ since the coordinates of a mark $k$ in the $GF(p^n)$ are $0$, $0$, $k$.

Let the $q$ distinct points of the plane defined by the given field be called

$$A_0,\ A_1,\ \dots,\ A_{q-1}, \tag{4}$$

and suppose the notation so chosen that the coordinates of $A_u$ are $(a_u,b_u,c_u)$,
$u=0,1,\dots,q-1$, where the $a$'s, $b$'s, and $c$'s are given by (2). If $k$ is any non-zero
element of the $GF(p^n)$, then $(ka_u,kb_u,kc_u)$ also are the coordinates of $A_u$.
The possible choices for the coordinates of the point $A_u$ then correspond to
the coordinates of all the marks

$$\lambda^{u+jq}, \qquad j=0,1,2,\dots,p^n-2, \tag{5}$$

similar to the mark $\lambda^u$. A point $A_u$ may then be identified with the class of
similar marks (5).

Two similar non-zero marks of the $GF(p^{3n})$ are linearly dependent with
respect to the $GF(p^n)$. Conversely, two non-zero linearly dependent marks of
the $GF(p^{3n})$ are similar. A point can then be considered as the totality of
(non-zero) marks of the $GF(p^{3n})$ linearly dependent with respect to the
$GF(p^n)$ on a given non-zero mark of the $GF(p^{3n})$. In the same way, a line can
be considered as the totality of (non-zero) marks of the $GF(p^{3n})$ linearly dependent
with respect to the $GF(p^n)$ on a given pair of linearly independent
marks of the $GF(p^{3n})$. The plane is the totality of (non-zero) marks of the
$GF(p^{3n})$ linearly dependent with respect to the $GF(p^n)$ on a given set of three
linearly independent marks of the $GF(p^{3n})$. Any four marks of the $GF(p^{3n})$
are linearly dependent with respect to the $GF(p^n)$; hence the plane exhausts
all the non-zero marks of the $GF(p^{3n})$.
We now prove the following theorem: 


THEOREM. There is always at least one collineation of period $q$ $(=p^{2n}+p^n+1)$
in the $PG(2,p^n)$.


Consider the transformation given by

$$\begin{aligned} y_1 &= a_3x_1 + x_2,\\ y_2 &= b_3x_1 + x_3,\\ y_3 &= c_3x_1, \end{aligned} \tag{6}$$

which sends the point $(x_1,x_2,x_3)$ into the point $(y_1,y_2,y_3)$, $a_3$, $b_3$, $c_3$ being the
coefficients in equation (1). A transformation of this type is a collineation,
indeed, a projective collineation (VB, p. 253). But from (2) we have

$$\begin{aligned} a_{i+1} &= a_3a_i + b_i,\\ b_{i+1} &= b_3a_i + c_i,\\ c_{i+1} &= c_3a_i, \end{aligned} \qquad i=0,1,\dots. \tag{7}$$

Hence the transformation (6) sends the mark $\lambda^u$ into the mark $\lambda^{u+1}$, and
therefore the collineation sends the point $A_u$ into the point $A_{u+1}$, $u=0,1,\dots,q-2$.
The point $A_{q-1}$ is sent into the point $A_0$. The theorem is therefore
proved.
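
The construction in the proof can be tried out on small cases. The sketch below is only an illustration under a simplifying assumption: it takes $n=1$, so that the field $GF(p)$ is the integers modulo a prime $p$, and it finds a primitive cubic by brute-force search rather than by any method of the paper. The function names are hypothetical.

```python
def next_power(coords, cubic, p):
    """Multiply a*L^2 + b*L + c by L, using L^3 = a3*L^2 + b3*L + c3 (transformation (6))."""
    a, b, c = coords
    a3, b3, c3 = cubic
    return ((a3 * a + b) % p, (b3 * a + c) % p, (c3 * a) % p)


def powers_of_root(cubic, p):
    """Coordinates of L^0, L^1, ... up to the first return to L^0 = (0, 0, 1)."""
    powers = [(0, 0, 1)]
    for _ in range(p**3):
        nxt = next_power(powers[-1], cubic, p)
        if nxt == (0, 0, 1):
            break
        powers.append(nxt)
    return powers


def find_primitive_cubic(p):
    """First (a3, b3, c3) whose root has the full order p^3 - 1."""
    for a3 in range(p):
        for b3 in range(p):
            for c3 in range(1, p):
                if len(powers_of_root((a3, b3, c3), p)) == p**3 - 1:
                    return (a3, b3, c3)
    raise ValueError("no primitive cubic found; p is assumed to be prime")


def singer_points(p):
    """Return q and coordinates of A_0, ..., A_{q-1} for PG(2, p), p prime."""
    q = p * p + p + 1
    return q, powers_of_root(find_primitive_cubic(p), p)[:q]


def normalize(coords, p):
    """Scale so that the first non-zero coordinate is 1 (one representative per point)."""
    lead = next(x for x in coords if x != 0)
    inverse = pow(lead, p - 2, p)  # inverse of lead modulo the prime p
    return tuple((inverse * x) % p for x in coords)


def line_through_A0_A1(p):
    """Subscripts u of the points A_u on the line x_1 = 0, which joins A_0 and A_1."""
    q, points = singer_points(p)
    return [u for u in range(q) if points[u][0] == 0]
```

For $p=2$ the search returns the cubic $x^3=x+1$, the seven points $A_0,\dots,A_6$ are distinct, and `line_through_A0_A1(2)` gives the subscripts `[0, 1, 3]`.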

This theorem has several immediate and interesting consequences. The
points and lines of a $PG(2,p^n)$ can be exhibited as a rectangular array of $q$
columns and $p^n+1$ rows; the elements of the array are the points, and the
points in a column are the points of a line (VY). By means of the theorem
we can show that the points and lines of the plane can be exhibited in a
regular array; that is, one in which each row is a cyclic permutation of the
first. For let the line containing the points $A_0$ and $A_1$ also contain the points
$A_{d_2}, A_{d_3}, \dots, A_{d_{p^n}}$. We write $d_0$ and $d_1$ for 0 and 1, respectively, and for the
sake of brevity, we denote a point $A_u$ by its subscript $u$.

Consider the array

$$\begin{array}{ccccc}
d_0 & d_0+1 & d_0+2 & \cdots & d_0+(q-1)\\
d_1 & d_1+1 & d_1+2 & \cdots & d_1+(q-1)\\
d_2 & d_2+1 & d_2+2 & \cdots & d_2+(q-1)\\
\vdots & \vdots & \vdots & & \vdots\\
d_{p^n} & d_{p^n}+1 & d_{p^n}+2 & \cdots & d_{p^n}+(q-1)
\end{array} \tag{8}$$

If all these integers are reduced modulo $q$, so that each lies in the range
$0,1,\dots,q-1$, each row will be a cyclic permutation of the first and each
row will represent the totality of points (4). The integers in the $(i+1)$st
column are equal to the corresponding ones of the $i$th column increased by
unity. The collineation (6) will then carry the $i$th column into the $(i+1)$st
(and the last column into the first); hence, since the integers in the first column
represent the points of a line, the integers in any column will represent the
points of a line.

The first two columns of the array (8) cannot be identical, for then $q$, the
number of points in the plane, would equal $p^n+1$. They must then represent
distinct lines and thus will have one and only one integer in common since
two lines intersect in just one point. This implies that the first column can
have only the one pair, $d_0$, $d_1$, of consecutive integers, modulo $q$. For if $d_r$, $d_s$ is
another pair of consecutive integers, where $1\ne d_s\equiv d_r+1 \pmod q$, the first
two columns would have the integers $d_1$ and $d_s$ in common. Since the first column
cannot have more than one pair of consecutive integers, modulo $q$, no
column can have more than one pair of consecutive integers, modulo $q$. It follows
that no two columns of the array are identical. For if the $(i+1)$st and the
$(j+1)$st were identical, we would have $d_0+i\equiv d_r+j$, $d_1+i\equiv d_s+j$, $d_s\ne 1$, all
modulo $q$. By subtracting the first congruence from the second, we see that
$d_r$ and $d_s$ are consecutive. But this is impossible, hence the columns of the
array (8) must represent the $q$ distinct lines of the plane. The array is, therefore,
a regular array exhibiting the points and lines of the plane.
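
The incidence properties of a regular array are easy to verify mechanically for small planes. The following sketch is illustrative only; it assumes the first column is given as a list of integers, and it uses the sets $\{0,1,3\}$ modulo 7 and $\{0,1,3,9\}$ modulo 13 as sample first columns, which are not taken from the text. The function names are hypothetical.

```python
from itertools import combinations


def regular_array(first_column, q):
    """Row i is d_i, d_i + 1, ..., d_i + (q - 1), every entry reduced modulo q."""
    return [[(d + j) % q for j in range(q)] for d in first_column]


def columns_as_lines(array):
    """Each column of the array, read as a set of point labels."""
    return [frozenset(column) for column in zip(*array)]


def is_plane(lines, q):
    """True if the q columns are distinct and every two points share exactly one column."""
    if len(lines) != q or len(set(lines)) != q:
        return False
    return all(
        sum(1 for line in lines if a in line and b in line) == 1
        for a, b in combinations(range(q), 2)
    )
```

With these helpers, `is_plane(columns_as_lines(regular_array([0, 1, 3], 7)), 7)` holds, while the first column `[0, 1, 2]` fails because it contains two pairs of consecutive integers.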

The regular array leads to an interesting result in the theory of numbers.
Consider those columns of the array (8) which contain the integer $0=d_0$;
namely,

$$\begin{array}{cccc}
d_0-d_0 & d_0-d_1 & \cdots & d_0-d_{p^n}\\
d_1-d_0 & d_1-d_1 & \cdots & d_1-d_{p^n}\\
\vdots & \vdots & & \vdots\\
d_{p^n}-d_0 & d_{p^n}-d_1 & \cdots & d_{p^n}-d_{p^n}
\end{array} \tag{9}$$

These columns represent the pencil of lines on the point $A_0$. Hence in the
square array (9) of $p^n+1$ rows and columns, the $p^n(p^n+1)$ integers not on the
principal diagonal are all distinct, and therefore are congruent, modulo $q$,
to the integers $1,2,\dots,p^{2n}+p^n$ in some order. The $p^n+1$ integers on the
principal diagonal are all zero. We have thus proved the following theorem:

THEOREM. A sufficient condition that there exist $m+1$ integers,

$$d_0,\ d_1,\ \dots,\ d_m, \tag{10}$$

having the property that their $m^2+m$ differences $d_i-d_j$, $i\ne j$; $i,j=0,1,\dots,m$,
are congruent, modulo $m^2+m+1$, to the integers

$$1,2,\dots,m^2+m \tag{11}$$

in some order is that $m$ be a power of a prime.*


We will call a set of integers such as (10) having the property described in
the theorem, a perfect difference set of order $m+1$. If the integers (10) form a
perfect difference set, so will the integers $d_0+d$, $d_1+d$, $\dots$, $d_m+d$, for any $d$.
Hence the integers in any column of the regular array (8) are a perfect difference
set. Also, the integers in the set

$$td_0,\ td_1,\ \dots,\ td_m \tag{12}$$

will form a perfect difference set whenever $t$ is relatively prime to $m^2+m+1$.
This is true since the integers $t, 2t, 3t, \dots, (m^2+m)t$, when reduced modulo
$m^2+m+1$, will be a rearrangement of the integers (11).
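
The definition and the two invariance remarks can be checked directly. This is a minimal sketch with hypothetical function names; the sample set $\{0,1,4,14,16\}$ modulo 21 (so $m=4$) is an assumption introduced for illustration and does not come from the text.

```python
from math import gcd


def is_perfect_difference_set(d):
    """Check that the m^2 + m differences d_i - d_j (i != j) hit 1, ..., m^2 + m once each."""
    m = len(d) - 1
    q = m * m + m + 1
    diffs = sorted(
        (x - y) % q
        for i, x in enumerate(d)
        for j, y in enumerate(d)
        if i != j
    )
    return diffs == list(range(1, q))


def translate(d, shift):
    """The set d_0 + shift, ..., d_m + shift."""
    return [x + shift for x in d]


def multiply(d, t):
    """The set t*d_0, ..., t*d_m of (12)."""
    return [t * x for x in d]
```

For the sample set every translate is again perfect, and `multiply(d, t)` is perfect for each `t` with `gcd(t, 21) == 1`. The multiplier `t = 3` is not prime to 21 and the property is lost, since the difference 7 is then carried into a multiple of 21.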

If (10) is a perfect difference set and $k$ is any integer in the set (11), the
congruence $d_x-d_y\equiv k \pmod{m^2+m+1}$ has a unique solution $d_x=d_r$, $d_y=d_s$,
$d_r$, $d_s$ in (10). Consider now the set of integers

$$a_0,\ a_1,\ \dots,\ a_m \tag{13}$$

defined by the congruences

$$a_i \equiv d_{i+1}-d_i \pmod{m^2+m+1}, \qquad i=0,1,\dots,m$$

(the subscript $m+1$ is to be replaced by the subscript 0). It follows from the
definition of the $a$'s that if $k\equiv d_r-d_s$, then

$$k \equiv a_s + a_{s+1} + \cdots + a_{r-1}$$

modulo $m^2+m+1$. That is, any integer $k$ of (11) is congruent, modulo
$m^2+m+1$, to a circular sum of the integers of (13), where by a circular
sum we mean a sum of consecutive integers of (13), considering $a_m$ and $a_0$
as consecutive. Since there are $m^2+m+1$ such circular sums, including the
sum $a_0+a_1+\cdots+a_m$, which is congruent to 0, modulo $m^2+m+1$, any integer
of the series

$$0,1,2,\dots,m^2+m \tag{14}$$

is congruent to one and only one circular sum of the integers of (13). The set
of integers (13) is therefore a perfect partition of $m^2+m+1$ in the sense of
Kirkman.† It is to be noted that the order in which the integers of (13) are


*In connection with this theorem, see the proposed problem and discussion by O. Veblen, 
F. H. Safford, and L. E. Dickson in the American Mathematical Monthly, vol. 13 (1906), pp. 46 and 
215, and vol. 14 (1907), p. 107. 

† Kirkman, On the perfect $r$-partitions of $r^2-r+1$, Transactions of the Historical Society of
Lancashire and Cheshire, vol. 9 (1857), pp. 127-142. The $r$ of Kirkman's paper is equal to $m+1$ here.
The problem of perfect partitions has been studied by a number of authors since Kirkman’s time. 




382 JAMES SINGER [May 


written is important, the same integers in a different order will usually not 
form a perfect partition. 
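
The passage from a perfect difference set to its perfect partition can be checked numerically. The following is a minimal Python sketch, not part of the original paper: the function names are hypothetical, and the set 0, 1, 3, 9 modulo 13 (the case $m=3$) is used as the sample input.

```python
def is_perfect_difference_set(ds, q):
    """Every non-zero residue mod q must equal d_i - d_j for exactly one ordered pair."""
    diffs = [(a - b) % q
             for i, a in enumerate(ds)
             for j, b in enumerate(ds) if i != j]
    return sorted(diffs) == list(range(1, q))


def perfect_partition(ds, q):
    """a_i = d_{i+1} - d_i (mod q), with the subscript m+1 replaced by 0."""
    n = len(ds)
    return [(ds[(i + 1) % n] - ds[i]) % q for i in range(n)]


def circular_sums(parts, q):
    """All sums of consecutive parts (cyclically), mod q; the full sum is counted once."""
    n = len(parts)
    sums = [sum(parts) % q]
    for start in range(n):
        total = 0
        for length in range(1, n):
            total += parts[(start + length - 1) % n]
            sums.append(total % q)
    return sums


def is_perfect_partition(parts, q):
    """Each residue 0, 1, ..., q-1 must be hit by exactly one circular sum."""
    return sorted(circular_sums(parts, q)) == list(range(q))


d = [0, 1, 3, 9]                      # sample perfect difference set, q = 13
a = perfect_partition(d, 13)
print(is_perfect_difference_set(d, 13), a, is_perfect_partition(a, 13))
print(is_perfect_partition([1, 6, 2, 4], 13))   # same integers, different order
```

The last line illustrates the remark that the order of the integers matters: a rearrangement of the same parts need not be a perfect partition.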

If we start with the perfect partition (13), we can obtain in an obvious 
way the perfect difference set (10). A perfect difference set can be developed 
into an array such as (8). If the integers are now interpreted as points and 
the columns as lines, it is an easy matter to verify that the array represents 
the points and lines of a finite projective geometry. Whether m must be a 
power of a prime, and whether, if it is, the plane is necessarily Pascalian and 
Desarguesian, are still open questions. 

Let $d_0', d_1', \dots, d_m'$ be a perfect difference set. It will contain just one
pair of consecutive integers, modulo $m^2+m+1$, for the congruence $d_i'-d_j'\equiv 1$
(mod $m^2+m+1$) has a unique solution. Suppose that $d_r'-d_s'\equiv 1$; then the
set


$d_0'-d_s', d_1'-d_s', \dots, d_m'-d_s'$


will be a perfect difference set and will contain the integers 0 and 1. Suppose 
also that each integer is reduced so that it lies in the range (14). We call such 
a set a reduced perfect difference set. Any perfect difference set leads to a 
unique reduced perfect difference set. Two reduced perfect difference sets 
will be called identical if they contain the same integers. The order in which
these integers are written is, of course, immaterial. Two perfect difference sets 
will be called equivalent if their reduced perfect difference sets are identical. 
Two perfect partitions will be called equivalent if their corresponding perfect 
difference sets are equivalent. If the integers (10) of a reduced perfect difference
set are written in normal order, that is if $d_i<d_{i+1}$, the corresponding
perfect partition will be called normal. If two perfect difference sets or two 
perfect partitions are equivalent, the corresponding normal perfect partitions 
will be identical, not only with respect to the integers involved, but also with 
respect to the order in which they are written. Thus, any two columns of the
array (8) will lead to identical normal perfect partitions, and conversely. 
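
The reduction described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the author's procedure: the names `reduce_set` and `normal_partition` are hypothetical, and the input is assumed to be a perfect difference set modulo `q`, so that exactly one pair of consecutive residues exists.

```python
def reduce_set(ds, q):
    """Translate a perfect difference set so that its unique pair of
    consecutive residues becomes (0, 1); return the residues in normal order."""
    residues = {d % q for d in ds}
    starts = [s for s in residues if (s + 1) % q in residues]
    if len(starts) != 1:
        raise ValueError("input is not a perfect difference set modulo q")
    s = starts[0]
    return sorted((d - s) % q for d in residues)


def normal_partition(ds, q):
    """Perfect partition of the reduced set written in normal order."""
    r = reduce_set(ds, q)
    n = len(r)
    return [(r[(i + 1) % n] - r[i]) % q for i in range(n)]


# Two translates of the same set (sample data) reduce to the same set.
print(reduce_set([2, 3, 5], 7), reduce_set([4, 5, 0], 7))
print(normal_partition([2, 3, 5], 7))
```

Two perfect difference sets are equivalent in the sense of the text exactly when `reduce_set` returns the same list for both.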

We now investigate the number of distinct perfect difference sets or, 
what is the same thing, the number of distinct perfect partitions of a given 
order. All known examples arise from a regular array exhibiting the points 
and lines of a $PG(2, p^n)$ defined by means of a $GF(p^n)$. We limit ourselves to
such perfect difference sets. The number $m^2+m+1$ is now $q=p^{2n}+p^n+1$.

First of all, the sets (10) and (12) are equivalent if $t$ is a power of $p$. (Clearly,
any power of $p$ is relatively prime to $q$.) To see this, let


$A_{d_0}, A_{d_1}, \dots, A_{d_{p^n}}$


be the points of the plane corresponding to the integers (10), and let


1938] FINITE PROJECTIVE GEOMETRY 383 


$\lambda^{d_0}, \lambda^{d_1}, \dots, \lambda^{d_{p^n}}$ be the marks of the $GF(p^{3n})$ whose coordinates are the
same as those of the points. The marks $\lambda^{td_0}, \lambda^{td_1}, \dots, \lambda^{td_{p^n}}$ will then
correspond to the integers (12). If $u$, $v$, and $w$ are any three integers of (10), then,
since $A_u$, $A_v$, and $A_w$ are collinear, there will exist three marks $k_1$, $k_2$, $k_3$ of the
$GF(p^n)$ such that


(15) $k_1\lambda^u + k_2\lambda^v + k_3\lambda^w = 0$.

Raising each side of (15) to the $p$th power, we get

(16) $k_1^p\lambda^{pu} + k_2^p\lambda^{pv} + k_3^p\lambda^{pw} = 0$.


(The other terms in the multinomial expansion will drop out because each
coefficient will be a multiple of $p$ and $p=0$ in the $GF(p^n)$.) Since $k_1^p$, $k_2^p$, $k_3^p$
are in the $GF(p^n)$, equation (16) shows that the marks $\lambda^{pu}$, $\lambda^{pv}$, and $\lambda^{pw}$ are
linearly dependent with respect to the $GF(p^n)$. Hence the points $A_{pu}$, $A_{pv}$,
and $A_{pw}$ are collinear, and the perfect difference sets (10) and (12) are
equivalent when $t=p$. The same argument shows that (10) and (12) are equivalent
when $t$ is equal to any power of $p$.

Secondly, it appears from all known examples, although a general proof 
is still lacking, that (10) and (12) will be distinct if $t$ is prime to $q$ and is not
a power of $p$ (mod $q$). However, if $t=-1$, the sets will be distinct since in this
case the integers in the normal perfect partition corresponding to the perfect 
difference set (12) will be the same as those in the normal perfect partition 
corresponding to the perfect difference set (10), but in reverse order. Since a 
perfect partition cannot contain two equal integers, a perfect partition is 
necessarily distinct from its inverse; hence the set (10) will be distinct from 
its inverse (12) for t= —1. 

It also seems to be true that if (10) and


(17) $d_0', d_1', \dots, d_m'$


are any two perfect difference sets of the same order, there is a $t$ for which
(12) and (17) are equivalent. If these statements are true, the number of
distinct perfect difference sets (or the number of distinct perfect partitions) for
a given $p^n$ is equal to


$\dfrac{\phi(q)}{3n}$,


where $\phi(q)$ is the Euler function, the number of positive integers not greater
than $q$ and prime to $q$. This number is even, since each perfect difference set
can be paired with its inverse.
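
The conjectured count can be compared with a direct enumeration in small cases. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, and it assumes the starting set is a perfect difference set, so that every multiple $tD$ with $t$ prime to $q$ can be reduced.

```python
from math import gcd


def euler_phi(q):
    """Number of positive integers not greater than q and prime to q."""
    return sum(1 for t in range(1, q + 1) if gcd(t, q) == 1)


def reduced(ds, q):
    """Reduced form (containing 0 and 1) of a perfect difference set mod q."""
    residues = {d % q for d in ds}
    starts = [s for s in residues if (s + 1) % q in residues]
    if len(starts) != 1:
        raise ValueError("input is not a perfect difference set modulo q")
    s = starts[0]
    return tuple(sorted((d - s) % q for d in residues))


def distinct_reduced_sets(ds, q):
    """Reduced forms of all multiples t*ds with t prime to q."""
    return {reduced([t * d for d in ds], q)
            for t in range(1, q) if gcd(t, q) == 1}


# p^n = 2 (q = 7, n = 1) and p^n = 3 (q = 13, n = 1), sample sets.
print(len(distinct_reduced_sets([0, 1, 3], 7)), euler_phi(7) // 3)
print(len(distinct_reduced_sets([0, 1, 3, 9], 13)), euler_phi(13) // 3)
```

For these two moduli the enumeration and the formula $\phi(q)/3n$ give the same number; the sketch says nothing about the general case, for which the text notes that a proof is lacking.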

I append a partial list of the (reduced) perfect difference sets and their 




corresponding normal perfect partitions. I give a single set for each p", the 
remaining ones can easily be found by the methods given above. 


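p^n | q | φ(q)/3n | perfect difference set | perfect partition
2 | 7 | 2 | 0 1 3 | 1 2 4
2^2 | 21 | 2 | 0 1 4 14 16 | 1 3 10 2 5
2^3 | 73 | 8 | 0 1 3 7 15 31 36 54 63 | 1 2 4 8 16 5 18 9 10
2^4 | 273 | 12 | 0 1 3 7 15 31 63 90 116 127 136 181 194 204 233 238 255 | 1 2 4 8 16 32 27 26 11 9 45 13 10 29 5 17 18
3 | 13 | 4 | 0 1 3 9 | 1 2 6 4
3^2 | 91 | 12 | 0 1 3 9 27 49 56 61 77 81 | 1 2 6 18 22 7 5 16 4 10
5 | 31 | 10 | 0 1 3 8 12 18 | 1 2 5 4 6 13
7 | 57 | 12 | 0 1 3 13 32 36 43 52 | 1 2 10 19 4 7 9 5
11 | 133 | 36 | 0 1 3 12 20 34 38 81 88 94 104 109 | 1 2 9 8 14 4 43 7 6 10 5 24
13 | 183 | 40 | 0 1 3 16 23 28 42 76 82 86 119 137 154 175 | 1 2 13 7 5 14 34 6 4 33 18 17 21 8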


The preceding concepts are susceptible of immediate generalization. Let 


(1’) — — — = 0 


be a primitive irreducible $(k+1)$st degree equation belonging to a $GF(p^n)$. A
root $\lambda$ of the equation is a generator of the non-zero elements of a $GF(p^{(k+1)n})$.
By means of the equation, we can express any power of $\lambda$ in terms of $\lambda^k$,
$\lambda^{k-1}, \dots, 1$, that is,


(2') $\lambda^i = a_{i,1}\lambda^k + a_{i,2}\lambda^{k-1} + \cdots + a_{i,k+1}, \quad i = 0, 1, \dots$


Conversely, any $k+1$ marks $a_{i,1}, \dots, a_{i,k+1}$, not all zero, of the $GF(p^n)$
will uniquely determine a power of $\lambda$, and hence a non-zero mark of the
$GF(p^{(k+1)n})$. The $k+1$ marks will be called the coordinates of that power of $\lambda$.

The $GF(p^{(k+1)n})$ defines a $k$-dimensional finite projective geometry,
$PG(k, p^n)$. An $h$-dimensional space, $h=0, 1, \dots, k$, is defined as the totality
of marks of the $GF(p^{(k+1)n})$ linearly dependent with respect to the included
$GF(p^n)$ on $h+1$ linearly independent marks of the $GF(p^{(k+1)n})$. A point is a
zero-space, a line is a 1-space, etc. Any $h+1$ linearly independent marks of
an $h$-space will define the same $h$-space. These definitions are equivalent to
those in the paper by Veblen and Bussey (loc. cit.) if the coordinates of a
point are interpreted as the coordinates of any mark in a class of similar
marks.






Let the sum $p^{hn}+p^{(h-1)n}+\cdots+p^n+1$ be denoted by $q_h$, $h=0, 1, \dots$.
If the $q_k$ distinct points of the $PG(k, p^n)$ are denoted by the integers


(4') $0, 1, \dots, q_k-1$,


the points, lines, planes, etc., of the geometry can be exhibited as a regular 
array in the form of a k-dimensional rectangular matrix whose elements are 
these integers. The integers in a properly chosen $(k-1)$-dimensional face of
the matrix represent the points of a (k—1)-space. The remaining (k—1)- 
spaces are the (k—1)-dimensional layers parallel to this face. The integers 
in these layers are obtained by successively adding 1’s to the integers of the 
first face. The integers in a properly chosen (k—2)-dimensional face of a 
(k—1)-dimensional face or layer represent the points of a (k —2)-dimensional 
space, etc. The existence of this regular array follows from the existence of the
transformation


(6') $x_1' = x_2 + a_{k+1,1}x_1$,
     $\cdots$
     $x_k' = x_{k+1} + a_{k+1,k}x_1$,
     $x_{k+1}' = a_{k+1,k+1}x_1$.


This transformation sends an $h$-space into an $h$-space since it preserves linear
dependence. It sends the mark $\lambda^u$ into the mark $\lambda^{u+1}$. The regular array can
then be constructed.

The regular array yields the difference set of $q_{k-1}$ integers


(10') $d_0, d_1, \dots, d_{q_{k-1}-1}$


having the property that their differences, $d_i-d_j$, $i\neq j$; $i, j=0, 1, \dots, q_{k-1}-1$,
are congruent, modulo $q_k$, to the integers


(11') $1, 2, \dots, q_k-1$,


each integer of (11') being congruent to $q_{k-2}$ of the differences. The difference
set (10') leads to a partition


(13') $a_0, a_1, \dots, a_{q_{k-1}-1}$


having the property that each of the integers in (11') is congruent, modulo $q_k$,
to exactly $q_{k-2}$ circular sums of (13'). The sum of all the integers of (13') is
congruent to 0, modulo $q_k$.
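
The counting property of (10') can be illustrated in the smallest three-dimensional case, $k=3$, $p^n=2$, where $q_3=15$, $q_2=7$, and $q_1=3$. The Python sketch below is an illustration only: the set 0, 1, 2, 4, 5, 8, 10 is an assumed example that is not taken from the paper, and the function names are hypothetical.

```python
from collections import Counter


def q_h(p, n, h):
    """q_h = p^(hn) + p^((h-1)n) + ... + p^n + 1."""
    return sum(p ** (n * i) for i in range(h + 1))


def difference_counts(ds, q):
    """How often each residue mod q occurs among the differences d_i - d_j, i != j."""
    return Counter((a - b) % q
                   for i, a in enumerate(ds)
                   for j, b in enumerate(ds) if i != j)


k, p, n = 3, 2, 1
modulus = q_h(p, n, k)                 # q_3 = 15
d = [0, 1, 2, 4, 5, 8, 10]             # assumed difference set with q_2 = 7 integers
counts = difference_counts(d, modulus)
print(modulus, len(d), sorted(set(counts.values())))
```

Each of the integers 1, 2, ..., 14 occurs the same number of times among the differences, and that number is $q_1$, in agreement with the statement that each integer of (11') is congruent to $q_{k-2}$ of the differences.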


BROOKLYN COLLEGE, 
BROOKLYN, N. Y. 




SYMMETRIC AND ALTERNATE MATRICES IN 
AN ARBITRARY FIELD, I* 


BY 
A. ADRIAN ALBERT 


INTRODUCTION 


The elementary theorems of the classical treatment of symmetric and al- 
ternate matrices may be shown, without change in the proofs, to hold for 
matrices whose elements are in any field of characteristic not two. The proofs 
fail in the characteristic two case and the results cannot hold since here the 
concepts of symmetric and alternate matrices coincide. But it is possible to 
obtain a unified treatment. We shall provide this here by adding a condition 
to the definition of alternate matrices which is redundant except for fields of 
characteristic two. The proofs of the classical results will then be completed 
by the addition of two necessary new arguments. 

The theorems on the definiteness of real symmetric matrices have had no 
analogues for general fields. They have been based on the property that the 
sum of any two non-negative real numbers is non-negative. This is equivalent
to the property that for every real $a$ and $b$ we have $a^2+b^2=c^2$ for a real $c$.
But $a^2+b^2=(a+b)^2$ in any field of characteristic two and we shall use this
fact to obtain complete analogues for arbitrary fields of characteristic two 
of the usual theorems on the definiteness of real symmetric matrices. 

Quadratic forms may be associated with symmetric matrices and the
problem of their equivalence is equivalent to the problem of the congruence 
of the corresponding matrices. This is true except when the field of reference 
has characteristic two where no matric treatment has been given. We shall 
associate quadratic forms in this case with a certain type of non-symmetric 
matrix and shall use our results on the congruence of alternate matrices to 
obtain a matrix treatment of the quadratic form problem. 

The classical theorems† on pairs of symmetric or alternate matrices with
complex elements will be shown here to be true for matrices with elements 
in any algebraically closed field whose characteristic is not two. This will be 
seen to imply that any two symmetric (or alternate) matrices are orthog- 
onally equivalent if and only if they are similar. But the proof fails for 
fields of characteristic two. 

* Presented to the Society, April 10, 1937; received by the editors April 26, 1937. 

† Cf. L. E. Dickson, Modern Algebraic Theories, chap. 6. See also the report of C. C. MacDuffee,
Ergebnisse der Mathematik, vol. 21 (1933), part 5, for this material as well as the classical results 


referred to above. The theory will also be found in J. H. M. Wedderburn’s Lectures on Matrices, 
American Mathematical Society Colloquium Publications, vol. 17, 1934. 




We shall prove the existence of two similar symmetric matrices with elements
in a field $\mathfrak{F}$ of characteristic two which are not orthogonally equivalent
in the algebraically closed extension of $\mathfrak{F}$. Our treatment of the theory of the
orthogonal equivalence in $\mathfrak{F}$ of characteristic two will be rational, that is, no
algebraic closure properties of $\mathfrak{F}$ will be assumed. Our formulation will
involve a recasting of the theory of similarity of square matrices and then a
corresponding parallel treatment of the theory of orthogonal equivalence. In
particular we shall obtain a complete determination of the invariant factors
of any symmetric matrix in $\mathfrak{F}$ of characteristic two.

The generalized transposition concept called an involution* J of the set 
of all n-rowed square matrices arises naturally in any rational treatment of 
orthogonal equivalence. The consequent study of the J-orthogonal equiva- 
lence of J-symmetric and J-alternate matrices will be introduced here and 
various important special types treated in subsequent papers. 


I. CONGRUENCE THEORY 


1. Elementary concepts. Let $\mathfrak{F}$ be an arbitrary field, and let $A=(a_{ij})$
$(i, j=1, \dots, n)$ be an $n$-rowed square matrix with elements $a_{ij}$ in $\mathfrak{F}$. We use
the customary notation A’ for the transpose of A and call A symmetric if 
A’=A. We shall modify the usual definition of alternate matrices however, 
and make the classically consistent definition: 


DEFINITION. A matrix $A$ is called alternate (or skew-symmetric) if $A'=-A$,
and the diagonal elements $a_{ii}$ of $A$ are all zero.


Notice that when the characteristic of $\mathfrak{F}$ is not two the final part of our
definition is redundant. But when the characteristic is two every symmetric
matrix has the property $A=-A'$ and the condition will be shown to be
essential. We shall also call a matrix $A$ non-alternate symmetric if $A=A'$ and $A$
is not alternate according to our definition above. This last condition is
redundant except for fields of characteristic two, in which case we are simply
assuming that $A=A'$ has a non-zero diagonal element.

Two square matrices $A$ and $B$ with elements in a field $\mathfrak{F}$ are said to be
congruent in $\mathfrak{F}$ if there exists a non-singular matrix $P$ with elements in $\mathfrak{F}$ such
that $B=PAP'$. It is easy to prove the following lemmas:†


LEMMA 1. Let $B$ be obtained from $A$ by any permutation of its rows followed
by the same permutation on the columns. Then $B$ and $A$ are congruent in $\mathfrak{F}$.


* See the author’s paper, Involutorial simple algebras and real Riemann matrices, Annals of 
Mathematics, vol. 36 (1935), pp. 886-964, p. 894 for the definition and some elementary properties 
of involutions. 

† The proofs of these lemmas may be found in the author's Modern Higher Algebra, chap. 5,
University of Chicago Press, 1937. 




LEMMA 2. Replace the $i$th row of $A$ by the sum of this row and any linear
combination of the remaining rows. Follow this with the corresponding column re-
placement. Then the resulting matrix B is congruent to A. 


If $G$ and $H$ are square matrices, the matrix


(1) $A = \begin{pmatrix} G & 0 \\ 0 & H \end{pmatrix}$


is called the direct sum of $G$ and $H$. This notion has an immediate
generalization to the direct sum


(2) $A = \begin{pmatrix} G_1 & & \\ & \ddots & \\ & & G_t \end{pmatrix}$

of square matrices $G_i$* called the components of $A$. It is clear that $A$ is
symmetric if and only if its components are symmetric. Also $A$ is alternate if and
only if the components of $A$ are all alternate. But in a field of characteristic
two a matrix $A$ may be non-alternate symmetric and yet may have some
alternate components but has at least one non-alternate component. We shall
show later that such matrices are always congruent to diagonal matrices, that
is, to direct sums of one-rowed square matrices. Our proofs will depend partly
on the almost trivial consequence of Lemmas 1 and 2.


LEMMA 3. Let $A$ have the form (1). Then $A$ is congruent to


$\begin{pmatrix} G_0 & 0 \\ 0 & H_0 \end{pmatrix}$


for any $G_0$ congruent to $G$ and $H_0$ congruent to $H$.


A principal sub-matrix of A is a sub-matrix whose main diagonal is a part 
of the main diagonal of A. The determinants of principal sub-matrices are 
called principal minors of $A$. Then it may easily be shown† that Lemmas 1
and 2 yield the following theorem: 


LEMMA 4. Let $G$ be a non-singular principal sub-matrix of a symmetric or
alternate matrix $A$. Then $A$ is congruent in $\mathfrak{F}$ to


$\begin{pmatrix} G & 0 \\ 0 & H \end{pmatrix}$

whose principal minors having $|G|$ as sub-determinant have the same values as
those of $A$.


* We shall henceforth use the notation diag$[G_1, \dots, G_t]$ for (2) to simplify printing.
† See the author's Modern Higher Algebra, chap. 5.




The one-rowed principal minors of $A$ are the elements $a_{ii}$. When they are
all zero and $A'=\pm A$ the two-rowed principal minors are

(3) $\begin{vmatrix} 0 & a_{ij} \\ a_{ji} & 0 \end{vmatrix} = \mp a_{ij}^2 = 0 \quad (i, j=1, \dots, n)$,

if and only if $A=0$. Thus we have the following lemma:


LEMMA 5. Every symmetric or alternate matrix $A\neq 0$ has a non-zero one- or
two-rowed principal minor.


We shall close our discussion of the tools of our theory by proving the 
lemma: 


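LEMMA 6. Let $A$ and $B$ be $r$-rowed non-singular matrices, and let


(4) $A_0 = \begin{pmatrix} A & 0 \\ 0 & 0 \end{pmatrix}, \quad B_0 = \begin{pmatrix} B & 0 \\ 0 & 0 \end{pmatrix}$


be $n$-rowed square matrices. Then $B_0=PA_0P'$ for a non-singular $P$ if and only if


(5) $P = \begin{pmatrix} Q & S \\ 0 & R \end{pmatrix}, \quad QAQ' = B$,


where $Q$ and $R$ are non-singular.


It is clear that the lemma implies that if $QAQ'=B$ for $Q$ non-singular, we
may choose any $R$ and $S$ such that $R$ is non-singular and obtain $PA_0P'=B_0$.
These results are an immediate consequence of the computation


(6) $\begin{pmatrix} Q & S \\ K & R \end{pmatrix}\begin{pmatrix} A & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} Q' & K' \\ S' & R' \end{pmatrix} = \begin{pmatrix} QAQ' & QAK' \\ KAQ' & KAK' \end{pmatrix} = \begin{pmatrix} B & 0 \\ 0 & 0 \end{pmatrix}$

if and only if $QAQ'=B$, $QAK'=KAQ'=KAK'=0$. But $B$ is non-singular
and so is $A$. Hence $Q$ is non-singular, so therefore is $AQ'$, and $K=0$ is our only
condition.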


2. Congruence of alternate matrices. We shall prove the following the- 
orem: 


THEOREM 1. Every matrix congruent to an alternate matrix is an alternate 
matrix. 


For let $x_1, \dots, x_n$ be independent indeterminates over $\mathfrak{F}$,

(7) $x = (x_1, \dots, x_n), \quad A = (a_{ij}) \quad (i, j=1, \dots, n)$.


If $a_{ij}=-a_{ji}$, the quadratic form


$xAx' = \sum_{i=1}^{n} a_{ii}x_i^2$.


When also $A$ is alternate the $a_{ii}$ are all zero and

$xAx' = 0$.

We let $B=PAP'$, $y=(y_1, \dots, y_n)$, and have $B'=-B$. But

$yBy' = \sum_{i=1}^{n} b_{ii}y_i^2 = xAx', \quad x=yP$,


so that $yBy'=xAx'=0$ identically in the $y_i$ and the $b_{ii}=0$. Hence $B$ is alternate.
The proof is of course unnecessary for fields $\mathfrak{F}$ of characteristic not two.
Notice that it implies the theorem:


THEOREM 2. Every matrix congruent to a non-alternate symmetric matrix is 
non-alternate symmetric. 


We may also easily prove the following theorem: 


THEOREM 3. Two alternate matrices are congruent in $\mathfrak{F}$ if and only if they
have the same rank $2t$.


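For if $A\neq 0$ is alternate, its diagonal elements are all zero. By Lemma 5
$A$ has a two-rowed principal minor


$\begin{vmatrix} 0 & -a \\ a & 0 \end{vmatrix}$


with $a\neq 0$. By Lemma 4 the matrix $A$ is congruent to


$\begin{pmatrix} 0 & -a & 0 \\ a & 0 & 0 \\ 0 & 0 & A_0 \end{pmatrix}$.

But


(10) $\begin{pmatrix} a^{-1} & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 0 & -a \\ a & 0 \end{pmatrix}\begin{pmatrix} a^{-1} & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} = E$,

so that by Lemma 3, $A$ is congruent to

$\begin{pmatrix} E & 0 \\ 0 & A_0 \end{pmatrix}$.

We apply Theorem 1 to see that $A_0$ is alternate. A repetition of this process
by the use of Lemma 3 shows that $A$ is congruent to the direct sum

(11) diag$[E_1, \dots, E_t, 0]$, $E_i = E$,

where $2t$ is the rank of $A$. Any other alternate matrix of rank $2t$ is congruent
to (11) and hence to $A$, and we have Theorem 3.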


The proof given above is valid for general fields only because we have 
proved Theorem 1. Notice that we have the following consequence: 




THEOREM 4. Every alternate matrix of rank $2t$ is congruent in $\mathfrak{F}$ to

(12) $\begin{pmatrix} 0 & -I_t & 0 \\ I_t & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$,

where $I_t$ is the $t$-rowed identity matrix.


This new form of Theorem 3 is a consequence of (11) and Lemma 1. 

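3. Congruence of non-alternate symmetric matrices. The Lemmas 1, 3,
4, 5 may be applied to an arbitrary symmetric matrix. They show that every
symmetric matrix is congruent to a direct sum of matrices of the forms


(13) $(a), \quad \begin{pmatrix} 0 & a \\ a & 0 \end{pmatrix} \quad (a\neq 0 \text{ in } \mathfrak{F})$.


But as in (10) we have


$\begin{pmatrix} a^{-1} & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 0 & a \\ a & 0 \end{pmatrix}\begin{pmatrix} a^{-1} & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$.


By Lemma 1 we have the preliminary reduction given by the following
lemma:


LEMMA 7. Every symmetric matrix is congruent in $\mathfrak{F}$ to a matrix

(14) diag$[D, G, 0]$,


where $D$ is a diagonal matrix with elements in $\mathfrak{F}$ and $G$ is a direct sum of
two-rowed matrices


(15) $E = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$.


The reduction of symmetric matrices with elements in a field $\mathfrak{F}$ of
characteristic not two is evidently completed by the fact that


(16) $\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} = \begin{pmatrix} 2 & 0 \\ 0 & -2 \end{pmatrix}$


and then that the matrix $G$ is congruent to a diagonal matrix whose diagonal
elements are all $2$ or $-2$. But the corresponding transformation in $\mathfrak{F}$ of
characteristic two is clearly singular. We complete our reduction in this case by
the computation in the following theorem: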




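THEOREM 5. Let $a\neq 0$ be in $\mathfrak{F}$ of characteristic two and

(17) $A = \begin{pmatrix} a & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \quad P = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & a \\ 1 & 1 & a \end{pmatrix}$.

Then
$PAP' = aI_3$,


where $I_3$ is the three-rowed identity matrix.

The result above seems quite remarkable, as one might expect that the
matrix $A$ which has an alternate sub-matrix would not be congruent to a
multiple of the identity matrix. We verify the computation in

(18) $PAP' = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & a \\ 1 & 1 & a \end{pmatrix}\begin{pmatrix} a & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}\begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & 1 \\ 0 & a & a \end{pmatrix} = \begin{pmatrix} a & 0 & 1 \\ a & a & 0 \\ a & a & 1 \end{pmatrix}\begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & 1 \\ 0 & a & a \end{pmatrix}$

$= \begin{pmatrix} a & a+a & a+a \\ a+a & a & a+a \\ a+a & a+a & a+a+a \end{pmatrix} = aI_3$,


since $a+a=0$.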
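
The computation of Theorem 5 can also be checked mechanically in a small field of characteristic two. The sketch below is an illustration under stated assumptions and is not part of the paper: it works in the field of four elements, represented as the integers 0 to 3 (bit patterns of polynomials over GF(2) reduced modulo $x^2+x+1$), with $A$ the direct sum of $(a)$ and $E$ and with $P$ having rows (1, 1, 0), (1, 0, $a$), (1, 1, $a$). All function names are hypothetical.

```python
def gf4_mul(x, y):
    """Multiply in GF(4): carry-less product reduced modulo x^2 + x + 1."""
    r = 0
    for i in range(2):
        if (y >> i) & 1:
            r ^= x << i
    if r & 0b100:            # a term x^2 is present; use x^2 = x + 1
        r ^= 0b111
    return r


def mat_mul(X, Y):
    """Matrix product over GF(4); addition in characteristic two is XOR."""
    out = [[0] * len(Y[0]) for _ in X]
    for i in range(len(X)):
        for j in range(len(Y[0])):
            acc = 0
            for t in range(len(Y)):
                acc ^= gf4_mul(X[i][t], Y[t][j])
            out[i][j] = acc
    return out


def transpose(X):
    return [list(row) for row in zip(*X)]


def theorem5_product(a):
    """Return P A P' for the matrices of (17) with the given non-zero a."""
    A = [[a, 0, 0], [0, 0, 1], [0, 1, 0]]
    P = [[1, 1, 0], [1, 0, a], [1, 1, a]]
    return mat_mul(mat_mul(P, A), transpose(P))


for a in (1, 2, 3):          # the three non-zero elements of GF(4)
    print(a, theorem5_product(a))
```

For each non-zero $a$ the product is the scalar matrix $aI_3$, as the theorem asserts for any field of characteristic two.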
As an immediate consequence of Theorem 5 and Lemma 7 we have the 
following theorem: 


THEOREM 6. Every non-alternate symmetric matrix is congruent to a diagonal 
matrix. 


The problem of finding when two diagonal matrices are congruent is not
solvable in a general field, as the structural properties of the field are involved
in this question. It is usual then to assume that $\mathfrak{F}$ is a field such that for every
$a$ of $\mathfrak{F}$ there exists a $b$ in $\mathfrak{F}$ such that $b^2=a$. Then for this case two non-alternate
symmetric matrices are congruent if and only if they have the same rank. Finally,
as in the classical theory, we may obtain a so-called Kronecker reduction of 
non-alternate symmetric matrices to diagonal form. The results are nearly 
the same as in the classical theory; they depend essentially on Lemma 6, and 
we shall not give the proofs. The only difference is that when in the Kronecker 
reduction we obtain a matrix 

A,_2 0 


with $E$ the matrix of (15), we use (16) if $\mathfrak{F}$ does not have characteristic two
and obtain a corresponding pair of diagonal elements $2$, $-2$. But when $\mathfrak{F}$ has
characteristic two the Kronecker reduction is completed by the use of
Theorem 5. Thus we replace $E$ by

$\begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix}$,


where $a$ is any diagonal element obtained at any stage of the reduction.

4. Definite symmetric matrices. The field $\mathfrak{R}'$ of all real numbers has the
characteristic property that if $a$ and $b$ are in $\mathfrak{R}'$, there exists a $c$ in $\mathfrak{R}'$ such
that $c^2=a^2+b^2$. This result has the analogue $a^2+b^2=(a+b)^2$ in fields of
characteristic two, and we shall use these results to obtain an analogue of the
concept of definiteness of real symmetric matrices.


DEFINITION. A symmetric matrix $A$ with elements in a field $\mathfrak{F}$ will be called
semi-definite if $A$ is congruent in $\mathfrak{F}$ to


$\begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}$,

where $r$ is the rank of $A$, and $I_r$ is the $r$-rowed identity matrix. A non-singular
semi-definite matrix will be called definite.


In the remainder of the section we assume that $\mathfrak{F}$ has characteristic two.*
Clearly Theorem 1 implies that alternate matrices are never semi-definite.
However by Theorems 4 and 5 the matrix


(20) $\begin{pmatrix} 1 & 0 \\ 0 & A \end{pmatrix}$


is semi-definite for every alternate matrix $A$. We use this result in the proof
of the following theorem:

THEOREM 7. A non-alternate symmetric matrix with elements in $\mathfrak{F}$ of
characteristic two is semi-definite if and only if its diagonal elements are the squares
of elements of $\mathfrak{F}$.

If $A\neq 0$ is one-rowed, it has the form $(a^2)$, is congruent to $I_1$, and is
definite. Hence our theorem is true for one-rowed matrices, and we make an
induction on the order of $A$. Let $A=(a_{ij})$ be $n$-rowed,


$a_{ij} = a_{ji}, \quad a_{ii} = a_i^2 \quad (a_i, a_{ij} \text{ in } \mathfrak{F})$.


At least one $a_i\neq 0$, and there is no loss of generality if we assume that $a_1\neq 0$.


* As the field of reference in what follows will sometimes be general and sometimes of
characteristic two we shall henceforth designate that it has characteristic two by writing $\mathfrak{F}^{(2)}$ except when the
condition is explicitly stated.




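Multiply the first row and column of $A$ by $a_1^{-1}$ and replace $a_1$ by 1, $A$ by a
congruent matrix $B=(b_{ij})$, $b_{ii}=b_i^2$, $b_{ij}=b_{ji}$. We add $b_{i1}$ times the first row of $B$
to its $i$th row and replace $b_{i1}$ by 0, $b_i^2$ by $b_i^2+b_{i1}^2=(b_i+b_{i1})^2=c_i^2$. The
corresponding column transformation then replaces $b_{1i}$ by 0, and leaves $c_i^2$
unaltered. Thus $A$ is congruent to the direct sum


$\begin{pmatrix} 1 & 0 \\ 0 & C \end{pmatrix}$,


where the diagonal elements of $C$ are $c_i^2$, $c_i$ in $\mathfrak{F}$. If $C$ is alternate we have
seen that the matrix (20) is semi-definite. Otherwise $C$ is semi-definite by our
induction and so is $A$.

Conversely let $A$ be semi-definite so that


$A = \begin{pmatrix} P_1 & P_2 \\ P_3 & P_4 \end{pmatrix}\begin{pmatrix} I_r & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} P_1' & P_3' \\ P_2' & P_4' \end{pmatrix} = \begin{pmatrix} P_1P_1' & P_1P_3' \\ P_3P_1' & P_3P_3' \end{pmatrix}$.

Then if $P_1=(d_{ij})$, the diagonal elements of $P_1P_1'$ have the form $\sum_{j=1}^{r} d_{ij}^2
= (\sum_{j=1}^{r} d_{ij})^2$. Similarly the diagonal elements of $P_3P_3'$ are squares.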


COROLLARY. The principal minors of a semi-definite symmetric matrix are
the squares of elements of $\mathfrak{F}$.


For every principal sub-matrix $B$ of a semi-definite matrix is semi-definite
by our theorem. If $|B|=0$, our result is true. Let $|B|\neq 0$, so that $B$ is definite
and $B=PP'$, $|B|=|P|^2$ as desired.

The classical result on real symmetric matrices states that A is positive 
semi-definite if and only if every principal minor of A is non-negative, that 
is, the square of a real number. Theorem 7 has a weaker hypothesis than the 
theorem about the real field but, in view of our corollary, the same conclusion. 
Thus our result is a true analogue of the corresponding real theorem. 

In a later section we shall require the theorem: 


THEOREM 8. Let $A$ be a semi-definite matrix with elements in an infinite field
$\mathfrak{F}$ of characteristic two. Then there exist quantities $a$ in $\mathfrak{F}$ such that


(21) $a^2I + A$

is definite.

For the diagonal elements of (21) are squares in $\mathfrak{F}^{(2)}$ when this is true of $A$.
The determinant $d(a)$ of (21) is a polynomial in $a$ with leading coefficient
unity, and thus there exist infinitely many elements $a$ in $\mathfrak{F}^{(2)}$ such that
$d(a)\neq 0$, (21) is definite.

5. Hermitian matrices. The classical theory of the conjunctivity of
Hermitian matrices with elements in a field $\mathfrak{K}$ holds for arbitrary fields. To verify
this note that the theory has already been shown to be valid for fields of
characteristic not two.* Let now $\mathfrak{F}$ be a field of characteristic two, and let
$\mathfrak{K}$ be a separable quadratic field over $\mathfrak{F}$. This is the only case that need be
considered. Then


(22) $\mathfrak{K} = \mathfrak{F}^{(2)}(\theta)$,

so that every $k$ of $\mathfrak{K}$ has the form

(23) $k = k_1 + k_2\theta \quad (k_1, k_2 \text{ in } \mathfrak{F})$.

The correspondence

(24) $k \to \bar{k} = k_1 + k_2(\theta + 1)$


is an automorphism of $\mathfrak{K}$ over $\mathfrak{F}^{(2)}$ with the property $\bar{\bar{k}}=k$. The theory of
Hermitian matrices is then a theory of matrices $A$ with elements in $\mathfrak{K}$. Write


(25) $A = (a_{ij}), \quad \bar{A} = (\bar{a}_{ij})$.


Then $(\bar{A})' = \overline{A'}$ is called the conjugate transpose of $A$; and we call a matrix
$A$ Hermitian if $A=\bar{A}'$.

Two Hermitian matrices $A$ and $B$ with elements in $\mathfrak{K}$ are said to be
conjunctive in $\mathfrak{K}$ if

(26) $B = DA\bar{D}'$


for a non-singular $D$ with elements in $\mathfrak{K}$. It is clear that all of the results
leading up to our reduction theorem to diagonal form hold.
Our reduction theory is now an immediate consequence of 


since $\bar{\theta}=\theta+1$, $\theta+\bar{\theta}=1$. We combine this result with the Hermitian analogue
of Lemma 7 and have proved the following theorem:


THEOREM 9. Every Hermitian matrix is conjunctive in $\mathfrak{K}$ to a diagonal matrix
with elements in $\mathfrak{F}$.


When $\mathfrak{F}$ is a perfect field† we have the usual result:


THEOREM 10. Any two Hermitian matrices with elements in $\mathfrak{K}$ over a perfect
$\mathfrak{F}$ of characteristic two are conjunctive in $\mathfrak{K}$ if and only if they have the same rank.


* Cf. the author’s Modern Higher Algebra. The theory is almost exactly the same as in L. E. 
Dickson’s Modern Algebraic Theories. 

† A perfect field $\mathfrak{F}$ of characteristic two has the property that every $a$ of $\mathfrak{F}$ is equal to $b^2$, $b$ in $\mathfrak{F}$.
Such fields with an over-field $\mathfrak{K}=\mathfrak{F}(\theta)$ exist. For the definition see van der Waerden's Moderne Algebra,
vol. 1, as well as the author’s own Modern Higher Algebra. 




The proof of the above result is trivial and will be omitted. 


II. QUADRATIC FORMS 


1. The matrices of a quadratic form. Let

$x = (x_1, \dots, x_n), \quad y = (y_1, \dots, y_n)$


with independent indeterminates $x_1, \dots, y_n$ over any field $\mathfrak{F}$. If $A$ is a
symmetric matrix with elements in $\mathfrak{F}$ the form $xAy'$ is called a symmetric bilinear
form in the variables $x_i$, $y_j$. A trivial computation then shows that two such
bilinear forms are equivalent if and only if their matrices are congruent. Thus
the symmetric matrix and the symmetric bilinear form theories are
equivalent. Analogous results evidently hold when $A$ is Hermitian, $xA\bar{y}'$ is an
Hermitian bilinear form. Moreover the theory of Hermitian quadratic forms $xA\bar{x}'$
is also equivalent to the theory of Hermitian matrices since we may always
choose $x_1, \dots, x_n, \bar{x}_1, \dots, \bar{x}_n$ to be independent indeterminates over $\mathfrak{K}$. In
the theory of fields of characteristic not two the theory of quadratic forms is
equivalent to that of symmetric matrices. This is not true for fields $\mathfrak{F}$ of
characteristic two, and we shall develop this theory here. It was developed for the
case of a finite field by L. E. Dickson (American Journal of Mathematics,
vol. 21 (1899), p. 194), but the results obtained there do not hold for an
arbitrary field $\mathfrak{F}$ of characteristic two. We introduce the theory as follows:
Every quadratic form in independent indeterminates has the form


(28) $f = \sum_{i,j=1}^{n} x_ia_{ij}x_j \quad (a_{ij} \text{ in } \mathfrak{F})$.


This expression is clearly not unique, but if also

(29) $f = \sum_{i,j=1}^{n} x_ia_{0ij}x_j$,


then, by equating the coefficients of $x_i^2$ and $x_ix_j$, we have

(30) $a_{ii} = a_{0ii}, \quad a_{ij}+a_{ji} = a_{0ij}+a_{0ji}$.

Write $x=(x_1, \dots, x_n)$, $A=(a_{ij})$, $A_0=(a_{0ij})$, so that

(31) $f = xAx' = xA_0x'$.


Then (30) states that $A+A'=A_0+A_0'$, $A$ and $A_0$ have the same diagonal
elements. The matrix $A_0-A$ has zero diagonal elements and $(A_0-A)'
= -(A_0-A)$. Conversely let $f=xAx'$ and $A_0=A+N$,
where $N=-N'$ and the diagonal elements of $N$ are all zero. Then
$A_0+A_0'=A+A'$, $f=xA_0x'$. We have proved the following theorem:




THEOREM 11. Let $f=xAx'$, $g=xA_0x'$, $A_0=A+N$. Then $f=g$ if and only
if $N$ is an alternate matrix.
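
Theorem 11 can be checked exhaustively for two-rowed matrices over the field of two elements. The following Python sketch is an illustration, not part of the paper; it represents a quadratic form by its coefficients (the coefficient of $x_i^2$ is $a_{ii}$ and that of $x_ix_j$ is $a_{ij}+a_{ji}$) and uses hypothetical function names and an assumed sample matrix $A$.

```python
from itertools import product


def form_coefficients(A, p):
    """Coefficients of xAx' modulo p: a_ii for x_i^2, a_ij + a_ji for x_i x_j (i < j)."""
    n = len(A)
    coeffs = {}
    for i in range(n):
        coeffs[(i, i)] = A[i][i] % p
        for j in range(i + 1, n):
            coeffs[(i, j)] = (A[i][j] + A[j][i]) % p
    return coeffs


def is_alternate(N, p):
    """N' = -N and all diagonal elements zero, modulo p."""
    n = len(N)
    zero_diagonal = all(N[i][i] % p == 0 for i in range(n))
    skew = all((N[i][j] + N[j][i]) % p == 0 for i in range(n) for j in range(n))
    return zero_diagonal and skew


def same_form(A, N, p):
    """True if xAx' and x(A+N)x' are the same quadratic form modulo p."""
    n = len(A)
    A0 = [[(A[i][j] + N[i][j]) % p for j in range(n)] for i in range(n)]
    return form_coefficients(A, p) == form_coefficients(A0, p)


A = [[1, 1], [0, 1]]                      # assumed sample matrix over GF(2)
agree = all(
    same_form(A, [[a, b], [c, d]], 2) == is_alternate([[a, b], [c, d]], 2)
    for a, b, c, d in product(range(2), repeat=4)
)
print(agree)
```

The extra condition on the diagonal of $N$ is what distinguishes alternate from symmetric matrices in characteristic two, and the enumeration shows that it cannot be dropped.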


In the study of quadratic forms with coefficients in a field $\mathfrak{F}$ of
characteristic not two it is customary to choose a unique matrix


(32) A = (ajj), = + 


so that $A$ is symmetric. Then the theory of quadratic forms is equivalent to
that of symmetric matrices. This is impossible in $\mathfrak{F}$ of characteristic two
both because (32) is impossible and because if $A=(a_{ij})=A'$, then $f=xAx'
=\sum_{i=1}^{n} a_{ii}x_i^2$. It is now natural to make the following definition:

DEFINITION. A quadratic form $f$ with coefficients in a field $\mathfrak{F}$ of characteristic
two is called diagonal or non-diagonal according as $f$ does or does not have the
expression


(33) $f = a_1x_1^2 + \cdots + a_nx_n^2 \quad (a_i \text{ in } \mathfrak{F})$.


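We shall now assume that the characteristic of $\mathfrak{F}^{(2)}$ is two. By a
non-singular linear transformation


(34) $x_j = \sum_{i=1}^{n} y_id_{ij} \quad (j=1, \dots, n)$


with matrix $D=(d_{ij})$ we carry a quadratic form $f$ into what is called an
equivalent form $g$. If $A$ is a matrix, $f=xAx'$, and $y=(y_1, \dots, y_n)$, then $x=yD$,


(35) $g = yDA(yD)' = yDAD'y'$.


Hence $DAD'$ is one possible matrix of $g$.
Consider in particular the case where


(36) $f = x_1^2 + x_2^2 = xAx', \quad A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$,


and write


(37) $x_1 = y_1 + y_2, \quad x_2 = y_2, \quad x = (y_1, y_2)\begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix} = yD$.


Then in a field $\mathfrak{F}^{(2)}$ we have

(38) $f = g = y_1^2$,


since $y_2^2+y_2^2=0$. A matrix of $g$ is


$\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$,


which is not merely incongruent to the two-rowed identity matrix $A$ but does
not even have the same rank. However


$DAD' = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$.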

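We now consider the general theory of quadratic forms in the light of the
above example. Write


(41) $f = \sum_{i\le j} a_{ij}x_ix_j \quad (i, j=1, \dots, n)$.

Then the matrix

(42) $A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1,n-1} & a_{1n} \\ 0 & a_{22} & \cdots & a_{2,n-1} & a_{2n} \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 0 & a_{nn} \end{pmatrix}$


is uniquely determined by $f$. We make a non-singular transformation $x=yD$
and carry $f$ into


(43) $g = yBy'$,


where $B$ has the form $B=(b_{ij})$, $b_{ij}=0$ for $i>j$. Then

(44) $B - DAD' = N$


is alternate by Theorem 11.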

THEOREM 12. Let f be a quadratic form with unique matrix A of (42). Then 
a non-singular transformation with matrix D carries f into an equivalent form 
with matrix B=DAD'+N where N is the unique alternate matrix chosen so that 
the elements below the diagonal in B are all zero. 
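
The statement of Theorem 12 can be traced on a small instance over the field of two elements. The sketch below is an illustration only: it takes $f=x_1^2+x_2^2$ (so that $A$ is the two-rowed identity matrix) and assumes the transformation $x_1=y_1+y_2$, $x_2=y_2$; the function names are hypothetical.

```python
def mat_mul_mod(X, Y, p):
    """Matrix product with entries reduced modulo p."""
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y))) % p
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


def triangular_matrix_of_form(M, p):
    """Upper triangular B with yBy' = yMy': b_ii = m_ii, b_ij = m_ij + m_ji for i < j."""
    n = len(M)
    B = [[0] * n for _ in range(n)]
    for i in range(n):
        B[i][i] = M[i][i] % p
        for j in range(i + 1, n):
            B[i][j] = (M[i][j] + M[j][i]) % p
    return B


def is_alternate(N, p):
    n = len(N)
    return (all(N[i][i] % p == 0 for i in range(n)) and
            all((N[i][j] + N[j][i]) % p == 0 for i in range(n) for j in range(n)))


p = 2
A = [[1, 0], [0, 1]]                       # matrix of f = x1^2 + x2^2
D = [[1, 0], [1, 1]]                       # x = yD, i.e. x1 = y1 + y2, x2 = y2
M = mat_mul_mod(mat_mul_mod(D, A, p), transpose(D), p)   # DAD'
B = triangular_matrix_of_form(M, p)        # unique triangular matrix of g
N = [[(B[i][j] - M[i][j]) % p for j in range(2)] for i in range(2)]
print(M, B, N, is_alternate(N, p))
```

Here $DAD'$ has rank two while the triangular matrix $B$ of $g=y_1^2$ has rank one; the two differ by the alternate matrix $N$.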

2. Diagonal quadratic forms. If $A=A'$, then $B=DAD'+N$ is symmetric.
Hence $g=yBy'$ is a diagonal quadratic form. This gives the following theorem:

THEOREM 13. Every quadratic form equivalent to a diagonal quadratic form
is a diagonal quadratic form in $\mathfrak{F}^{(2)}$.

A simpler proof is given as follows. We let $f=a_1x_1^2+\cdots+a_nx_n^2$ and use
(34). Then


$f = \sum_{j=1}^{n} a_j\Big(\sum_{i=1}^{n} d_{ij}y_i\Big)^2 = \sum_{i=1}^{n}\Big(\sum_{j=1}^{n} a_jd_{ij}^2\Big)y_i^2$,


since $(a+b)^2=a^2+b^2$ in $\mathfrak{F}^{(2)}$. However we have now proved the theorem:




THEOREM 14. Two diagonal quadratic forms f and g = \sum_{i=1}^{n} b_iy_i^2
are equivalent in § if and only if the coefficients b_i are representable as values
f(d_{i1}, \cdots, d_{in}) of f such that the corresponding determinant |d_{ij}| \neq 0.


We cannot go into questions of representation in a general field §. How- 
ever we do have the following theorem: 


THEOREM 15. Let § be perfect. Then any two non-zero diagonal quadratic 
forms in n indeterminates are equivalent. 


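For


f = \sum_{i=1}^{n} a_ix_i^2 = \sum_{i=1}^{n} (\alpha_ix_i)^2 = ( \sum_{i=1}^{n} \alpha_ix_i )^2,


where \alpha_i^2 = a_i, \alpha_i in §. Similarly


g = \sum_{i=1}^{n} b_iy_i^2 = ( \sum_{i=1}^{n} \beta_iy_i )^2, \beta_i^2 = b_i.


Then both f and g are equivalent to the same quadratic form and hence to
each other.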
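
The identity used in this proof can be checked on a small perfect field. The sketch below assumes the four-element field GF(4), with elements coded as the integers 0 to 3 (bit patterns of polynomials over GF(2) modulo x^2 + x + 1); the encoding and all function names are illustrative assumptions.

```python
from itertools import product

def gf4_mul(a, b):
    """Multiply in GF(4), elements coded as bit polynomials mod x^2 + x + 1."""
    r = 0
    for i in range(2):
        if (b >> i) & 1:
            r ^= a << i
    if r & 0b100:          # reduce the x^2 term
        r ^= 0b111
    return r

def gf4_sqrt(a):
    """Square root in GF(4): a^4 = a, so (a^2)^2 = a."""
    return gf4_mul(a, a)

def diagonal_form(coeffs, xs):
    """Value of sum a_i x_i^2 (addition in characteristic two is XOR)."""
    total = 0
    for a, x in zip(coeffs, xs):
        total ^= gf4_mul(a, gf4_mul(x, x))
    return total

def square_of_linear(coeffs, xs):
    """Value of (sum alpha_i x_i)^2 with alpha_i^2 = a_i."""
    lin = 0
    for a, x in zip(coeffs, xs):
        lin ^= gf4_mul(gf4_sqrt(a), x)
    return gf4_mul(lin, lin)

coeffs = (1, 2, 3)
identity_holds = all(diagonal_form(coeffs, xs) == square_of_linear(coeffs, xs)
                     for xs in product(range(4), repeat=3))
```

The check runs over every point of GF(4)^3; it illustrates the identity for one choice of coefficients and is not a proof.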

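3. Non-diagonal quadratic forms. Non-diagonal quadratic forms f are
not equivalent to diagonal quadratic forms by Theorem 13. Then f = xAx',
A + A' \neq 0. By Theorem 12 we have B = DAD' + N, N + N' = 0, and
B + B' = D(A + A')D'. The matrix A + A' is alternate and has rank 2r by Theorem
3. Moreover we may always carry f into a form g whose matrix B has the
property


(45) D(A + A')D' = B + B' = \begin{pmatrix} W_r & 0 \\ 0 & 0 \end{pmatrix}, W_r = \begin{pmatrix} 0 & I_r \\ I_r & 0 \end{pmatrix},


with I_r the r-rowed identity matrix. Write


B = \begin{pmatrix} B_1 & B_2 \\ 0 & B_3 \end{pmatrix},


where B_1 has 2r rows and columns. Since B has elements below the diagonal
all zero this is true of B_1 and B_3. But


B + B' = \begin{pmatrix} B_1 + B_1' & B_2 \\ B_2' & B_3 + B_3' \end{pmatrix} = \begin{pmatrix} W_r & 0 \\ 0 & 0 \end{pmatrix},


so that B_2 = 0, B_1 + B_1' = W_r. Then


(46) B = \begin{pmatrix} G_1 + T_r & 0 \\ 0 & G_2 \end{pmatrix}, T_r = \begin{pmatrix} 0 & I_r \\ 0 & 0 \end{pmatrix},


and G_1 and G_2 = B_3 are diagonal matrices. It is clear that r is an invariant of f,
and we have proved the following theorem: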
and we have proved the following theorem: 


THEOREM 16. Every non-diagonal quadratic form is equivalent to a form
(47) f = \sum_{i=1}^{n} a_ix_i^2 + x_1x_{r+1} + \cdots + x_rx_{2r}.
Moreover two forms f and g are equivalent only if they have the same rank
invariant r.


Let f of the form (47) go into g of the form (47) under a transformation 
of matrix D. Then D leaves A +A’ invariant. We suppose that 


(48) 


and by Lemma 6 see that 


(49) D(A + A')D' = A + A'


if and only if R is a non-singular matrix, where


(50) D = \begin{pmatrix} H & K \\ 0 & R \end{pmatrix}, HW_rH' = W_r.


Then

(51) B + N = DAD' = \begin{pmatrix} H(G_1 + T_r)H' + KG_2K' & KG_2R' \\ RG_2K' & RG_2R' \end{pmatrix}


with N alternate, and


(52) B = \begin{pmatrix} G_{10} + T_r & 0 \\ 0 & G_{20} \end{pmatrix},


where G_{20} is a non-alternate symmetric matrix whose diagonal elements
coincide with those of RG_2R'. This gives the theorem:


THEOREM 17. Let


f = \sum_{i=1}^{2r} a_ix_i^2 + (x_1x_{r+1} + \cdots + x_rx_{2r}) + \sum_{i=2r+1}^{n} a_ix_i^2,


g = \sum_{i=1}^{2r} b_iy_i^2 + (y_1y_{r+1} + \cdots + y_ry_{2r}) + \sum_{i=2r+1}^{n} b_iy_i^2.


Then f and g are equivalent in § only if the forms

(53) \sum_{i=2r+1}^{n} a_ix_i^2, \sum_{i=2r+1}^{n} b_iy_i^2

are equivalent.


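We next write


(54) H = \begin{pmatrix} L & U \\ V & M \end{pmatrix},

where L, M, U, V are r-rowed square matrices. Then

(55) HW_rH' = \begin{pmatrix} L & U \\ V & M \end{pmatrix} \begin{pmatrix} 0 & I \\ I & 0 \end{pmatrix} \begin{pmatrix} L' & V' \\ U' & M' \end{pmatrix} = \begin{pmatrix} U & L \\ M & V \end{pmatrix} \begin{pmatrix} L' & V' \\ U' & M' \end{pmatrix} = \begin{pmatrix} UL' + LU' & UV' + LM' \\ VU' + ML' & MV' + VM' \end{pmatrix} = W_r


if and only if
(56) UL' = (UL')', MV' = (MV')', UV' + LM' = I,
where I is the r-rowed identity matrix. Then if
(57) G_1 = \begin{pmatrix} C & 0 \\ 0 & J \end{pmatrix},
where C and J are r-rowed diagonal matrices, we have

(58) HG_1H' = \begin{pmatrix} L & U \\ V & M \end{pmatrix} \begin{pmatrix} C & 0 \\ 0 & J \end{pmatrix} \begin{pmatrix} L' & V' \\ U' & M' \end{pmatrix} = \begin{pmatrix} LC & UJ \\ VC & MJ \end{pmatrix} \begin{pmatrix} L' & V' \\ U' & M' \end{pmatrix} = \begin{pmatrix} LCL' + UJU' & 0 \\ 0 & VCV' + MJM' \end{pmatrix} + N_1,


where N_1 is alternate. Also

(59) HT_rH' = \begin{pmatrix} L & U \\ V & M \end{pmatrix} \begin{pmatrix} 0 & I \\ 0 & 0 \end{pmatrix} \begin{pmatrix} L' & V' \\ U' & M' \end{pmatrix} = \begin{pmatrix} LU' & LM' \\ VU' & VM' \end{pmatrix} = \begin{pmatrix} LU' & 0 \\ 0 & VM' \end{pmatrix} + T_r + \begin{pmatrix} 0 & UV' \\ VU' & 0 \end{pmatrix},

since I + LM' = UV'. The last matrix in (59) is alternate; hence the diagonal
matrix G_{10} has the same diagonal elements as


(60) KG_2K' + \begin{pmatrix} LCL' + UJU' + LU' & 0 \\ 0 & VCV' + MJM' + VM' \end{pmatrix}.


The quadratic form with matrix (60) is a quadratic form \sum_{i=1}^{2r} b_iy_i^2. The part
corresponding to the arbitrary matrix K is clearly


(61) \sum_{j=1}^{2r} ( \sum_{i=2r+1}^{n} a_ik_{ji}^2 ) y_j^2


for arbitrary k_{ji} and a_i given as in Theorem 17. We now let f and g be two
arbitrary forms satisfying the necessary conditions of Theorems 16, 17 so that
we may write


(62) f = \sum_{i=2r+1}^{n} a_ix_i^2 + \sum_{i=1}^{r} (a_ix_i^2 + x_ix_{i+r} + a_{i+r}x_{i+r}^2),


(63) g = \sum_{i=2r+1}^{n} a_iy_i^2 + \sum_{i=1}^{r} (b_iy_i^2 + y_iy_{i+r} + b_{i+r}y_{i+r}^2).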

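Also write


(64) C = diag [a_1, \cdots, a_r], J = diag [a_{r+1}, \cdots, a_{2r}], z = (y_1, \cdots, y_r), w = (y_{r+1}, \cdots, y_{2r}).


Then we have proved the following theorem: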


THEOREM 18. The forms f and g are equivalent in § if and only if there
exist r-rowed square matrices L, M, U, V such that LU' and VM' are symmetric,
UV' + LM' = I_r, and the quadratic form


(65) \sum_{i=1}^{2r} b_iy_i^2 - z(LCL' + UJU' + LU')z' - w(VCV' + MJM' + VM')w'


may be expressed as (61) for k_{ji} in §.

It does not seem possible to materially simplify (65) for an arbitrary field
§. However we may obtain an analogue of the classical complex number
case by proving the theorem:

THEOREM 19. Every non-diagonal quadratic form with elements in an
algebraically closed field § of characteristic two and rank invariant r is equivalent
to one and only one of the forms


(66) x_1x_{r+1} + \cdots + x_rx_{2r},

(67) x_1x_{r+1} + \cdots + x_rx_{2r} + x_{2r+1}^2.


Hence two non-diagonal quadratic forms are equivalent in § if and only if they
have the same rank invariant r and the same type (66) or (67).


For if § is algebraically closed, we use Theorem 15 and transform the
form (53) into x_{2r+1}^2 if it is not identically zero. By Theorem 16 our above
theorem is true if we can show that the form


\sum_{i=1}^{r} (a_ix_i^2 + x_ix_{i+r} + a_{i+r}x_{i+r}^2)


is equivalent to (66). This is clearly true if it can be proved that the two
forms

ax^2 + xy + by^2, XY


are equivalent for every a and b of §. If a = b = 0 the result is trivial. We
may therefore assume a \neq 0 without loss of generality. The equation


\omega^2 + \omega + ab = 0

has two distinct roots \lambda, \lambda + 1 in §. Thus

(\omega - \lambda)(\omega - \lambda - 1) = \omega^2 + \omega + ab.

Put \omega = axy^{-1} and multiply by a^{-1}y^2 to obtain

a^{-1}y^2(a^2x^2y^{-2} + axy^{-1} + ab) = ax^2 + xy + by^2 = XY,

where

X = a^{-1}y(\omega - \lambda) = a^{-1}y(axy^{-1} - \lambda) = x - a^{-1}\lambda y

and

Y = y(\omega - \lambda - 1) = y(axy^{-1} - \lambda - 1) = ax - y(\lambda + 1).

The determinant of the transformation is


\begin{vmatrix} 1 & a \\ a^{-1}\lambda & \lambda + 1 \end{vmatrix} = 1,


and we have proved our theorem.
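
The factorization ax^2 + xy + by^2 = XY can be tested on a small field of characteristic two. The sketch below uses GF(4), which is not algebraically closed, so it only treats those pairs (a, b) for which the auxiliary equation has a root in GF(4); the integer encoding of field elements and the function names are assumptions made for illustration.

```python
from itertools import product

def gf4_mul(a, b):
    """Multiply in GF(4), elements coded as bit polynomials mod x^2 + x + 1."""
    r = 0
    for i in range(2):
        if (b >> i) & 1:
            r ^= a << i
    if r & 0b100:
        r ^= 0b111
    return r

def gf4_inv(a):
    """Inverse of a nonzero element, found by search."""
    return next(b for b in range(1, 4) if gf4_mul(a, b) == 1)

def factor_check(a, b):
    """True if a*x^2 + x*y + b*y^2 == X*Y on GF(4)^2; None if no root exists."""
    ab = gf4_mul(a, b)
    roots = [w for w in range(4) if (gf4_mul(w, w) ^ w ^ ab) == 0]
    if not roots:
        return None                      # root lies in a larger field
    lam = roots[0]
    a_inv = gf4_inv(a)
    for x, y in product(range(4), repeat=2):
        lhs = (gf4_mul(a, gf4_mul(x, x)) ^ gf4_mul(x, y)
               ^ gf4_mul(b, gf4_mul(y, y)))
        X = x ^ gf4_mul(gf4_mul(a_inv, lam), y)    # X = x - a^{-1}*lam*y
        Y = gf4_mul(a, x) ^ gf4_mul(lam ^ 1, y)    # Y = a*x - (lam + 1)*y
        if lhs != gf4_mul(X, Y):
            return False
    return True

results = {(a, b): factor_check(a, b) for a in range(1, 4) for b in range(4)}
```

Subtraction coincides with addition (XOR) in characteristic two, which is why no signs appear in the code.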


III. PAIRS OF SYMMETRIC MATRICES


1. The problem. The theory of the congruence of pairs of symmetric 
matrices has been studied only in the classical case of matrices with com- 
plex elements. The results hold however for matrices with elements in any 
algebraically closed field whose characteristic is not two. It will be our pur- 
pose in the present chapter to develop these results and to show precisely 
where the classical proofs fail. 

2. The nth roots of a matrix. Let \mathfrak{J} be an integral domain with the
property that every two elements of \mathfrak{J} have a greatest common divisor in
\mathfrak{J} which is linearly expressible in terms of them. Then the congruence
(68) ax \equiv b (c)
has a solution x for every a prime to c.


We consider a prime (or irreducible) element \pi of \mathfrak{J} and study the
congruence


(69) f(x) \equiv 0 (\pi^e).


Here f(x) is a polynomial in an indeterminate x with coefficients in \mathfrak{J}, and e
is any positive integer. It is clear that (69) implies that, in particular,


(70) f(x) \equiv 0 (\pi)


must have a solution. Let then f(x) \equiv 0 (\pi^{e-1}) have a solution x_0, and write
x = x_0 + y\pi^{e-1}. The Taylor expansion of f(x) then implies that (69) is satisfied
if and only if f(x_0) + y\pi^{e-1}f'(x_0) \equiv 0 (\pi^e), that is,


(71) yf'(x_0) \equiv -\beta (\pi),


where \beta\pi^{e-1} = f(x_0). It is clear that (71) has a solution if f'(x_0) \not\equiv 0 (\pi). This
gives the lemma:


Lemma 8. Let f(x) \equiv 0 (\pi) have a solution x_0 such that f'(x_0) \not\equiv 0 (\pi). Then
there exists a solution of (69) for every e.
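
The lifting step behind Lemma 8 can be sketched for the familiar case in which the integral domain is the ring of integers and the prime element is a prime number p. This is an illustration under that assumption only; the function name `lift_root` is hypothetical, and the call `pow(x, -1, p)` requires Python 3.8 or later.

```python
def lift_root(coeffs, x0, p, e):
    """Lift a root x0 of f modulo p to a root modulo p**e.

    coeffs holds the integer coefficients of f, highest degree first.
    Requires f(x0) = 0 (mod p) and f'(x0) != 0 (mod p), as in Lemma 8.
    """
    def f(x):
        value = 0
        for c in coeffs:
            value = value * x + c
        return value

    def df(x):
        n = len(coeffs) - 1
        value = 0
        for k, c in enumerate(coeffs[:-1]):
            value = value * x + c * (n - k)
        return value

    if f(x0) % p != 0 or df(x0) % p == 0:
        raise ValueError("x0 must be a simple root of f modulo p")

    x, modulus = x0 % p, p
    for _ in range(e - 1):
        beta = f(x) // modulus                  # f(x) = beta * modulus exactly
        y = (-beta * pow(df(x), -1, p)) % p     # solve y*f'(x) = -beta (mod p)
        x += y * modulus
        modulus *= p
    return x

root = lift_root([1, 0, -2], 3, 7, 3)   # x^2 - 2, starting from 3 modulo 7
```

Each pass through the loop is one application of the congruence (71), raising the exponent of the modulus by one.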


The result of our lemma may now be applied to non-singular matrices A
with elements in a field §. Suppose that


g(\xi) = \pi_1^{e_1} \cdots \pi_t^{e_t}


is a factorization of the minimum function of A into powers of distinct
irreducible functions \pi_i(\xi), and let \mathfrak{J} be the integral domain of all polynomials
in the indeterminate \xi. We consider the congruence x^n \equiv \xi (g). This
congruence may be easily shown to be solvable modulo g if and only if it is
solvable modulo \pi_i^{e_i} for i = 1, \cdots, t. Now the derivative of x^n - \xi is nx^{n-1} \equiv 0 (\pi_i)
if and only if nx \equiv 0 (\pi_i). But x^n - \xi \equiv 0 (\pi_i) so that either \xi \equiv 0 (\pi_i) or n \equiv 0
(\pi_i). Then \xi \equiv 0 (\pi_i) means that \xi is a factor of g(\xi) which is impossible when
A is non-singular. Also n \not\equiv 0 (\pi_i) for \pi_i irreducible unless the characteristic
of § divides n. Hence the conditions of our lemma reduce to x^n - \xi \equiv 0 (\pi_i).
The equation \pi_i(\xi) = 0 defines a field §(\xi_i) over § equivalent to the set of all
residue classes modulo \pi_i. Then x^n - \xi \equiv 0 (\pi_i) if and only if \xi_i = x_i^n for x_i in
§(\xi_i), and when this condition is satisfied we have [x(\xi)]^n - \xi \equiv 0 (g(\xi)),
[x(A)]^n = A. We have proved the following theorem:

* This is the standard technique for the study of congruences f(x) \equiv 0 (mod p^e) in the theory of
numbers (as in L. E. Dickson's Introduction to the Theory of Numbers, p. 16, ex. 4). We are using the
analogous property of the polynomial domain (cf. Lemma 35.21, p. 60, of MacDuffee's tract on The
Theory of Matrices).


THEOREM 20. Let n be an integer not divisible by the characteristic of §, A be
a non-singular matrix whose minimum function g(\xi) has the distinct irreducible
factors \pi_i(\xi), §_i = §(\xi_i) be the corresponding algebraic fields over §. Then there
exists a polynomial P(A) whose nth power is A if and only if the equations
(72) x_i^n = \xi_i
have solutions x_i in §(\xi_i).


We next consider singular matrices A. Then g(\xi) = \xi^rg_0(\xi), where g_0(0) \neq 0
and r \geq 1. Thus x^n \equiv \xi (g) implies that x^n - \xi \equiv 0 (\xi^r), x = \xi x_1, \xi^{n-1}x_1^n \equiv 1
(\xi^{r-1}) which is impossible for r > 1. This gives the theorem:


THEOREM 21. Let n > 1, A be a singular matrix whose minimum function
g(\xi) is divisible by \xi^2. Then there exists no polynomial in A whose nth power is A.


As an immediate corollary of the above argument we have the following 
theorem: 


THEOREM 22. Let n > 1, A be a singular matrix whose minimum function
has the form \xi g(\xi) where g(\xi) is not divisible by \xi and has irreducible factors
with the properties of Theorem 20. Then there exists a polynomial P(A) with
coefficients in § whose nth power is A.


Theorem 20 may be applied to prove the following theorem: 


THEOREM 23. Let § be an algebraically closed field, n be an integer not di- 
visible by the characteristic of §, A be a non-singular matrix. Then there exists a 
polynomial in A with coefficients in § whose nth power is A. 


For the fields §_i of Theorem 20 are all equal to §. Moreover x^n = \xi_i has a
root x_i in §_i and we have our theorem.


Theorem 23 does not hold for arbitrary matrices A if the characteristic
of § divides n. For let § have characteristic p and B be the p-rowed square
matrix all of whose elements are unity. A trivial computation shows that
B^2 = 0, B^p = 0. The matrix A = I + B is non-singular since A^p = I.
Now any polynomial in A has the form a_0 + a_1B, a_0 and a_1 in §. Then
(a_0 + a_1B)^p = a_0^p \neq A for any a_0, a_1. If n = pq we have (a_0 + a_1B)^n = a_0^n \neq A. This
proves the following theorem:


THEOREM 24. Let the characteristic of § divide n. Then there exist
non-singular matrices A such that no polynomial \phi(A) has the property [\phi(A)]^n = A.
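
The counterexample preceding Theorem 24 can be checked directly for a small prime. The sketch below assumes p = 3 and works with integers modulo p; the helper names are hypothetical.

```python
from itertools import product

p = 3  # assumed small prime, chosen only for illustration

def matmul(X, Y):
    """Matrix product with entries reduced modulo p."""
    return [[sum(a * b for a, b in zip(row, col)) % p for col in zip(*Y)]
            for row in X]

def matpow(X, k):
    result = [[int(i == j) for j in range(p)] for i in range(p)]
    for _ in range(k):
        result = matmul(result, X)
    return result

I = [[int(i == j) for j in range(p)] for i in range(p)]
B = [[1] * p for _ in range(p)]                      # all elements unity
A = [[(I[i][j] + B[i][j]) % p for j in range(p)] for i in range(p)]

def poly_in_B(a0, a1):
    """The matrix a0*I + a1*B, the general polynomial in A."""
    return [[(a0 * I[i][j] + a1 * B[i][j]) % p for j in range(p)]
            for i in range(p)]

B_squared = matmul(B, B)
# every pair (a0, a1) whose p-th power would reproduce A
pth_roots = [(a0, a1) for a0, a1 in product(range(p), repeat=2)
             if matpow(poly_in_B(a0, a1), p) == A]
```

The list `pth_roots` comes out empty, in agreement with the statement that no polynomial in A has pth power equal to A.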


3. Equivalence of pairs of symmetric and alternate matrices. The result
of Theorem 23 may be applied to the theory of equivalence of pairs of
symmetric matrices. We assume that § is algebraically closed as is usual in the
classical case. Suppose now that P and Q are non-singular matrices such that


(73) PAQ = B,
where A and B are either both symmetric or both alternate. Then (PAQ)'
= Q'A'P' = B' so that
(74) PAQ = Q'AP', AG' = GA,
where
(75) G = P^{-1}Q'
is non-singular. We now assume that the characteristic of § is not two and
use Theorem 23 to obtain a polynomial \phi(G) such that

[\phi(G)]^2 = G.

Now AG' = GA implies that AG'^2 = G^2A, A(G')^k = G^kA, A\phi(G') = \phi(G)A for
any polynomial in G with coefficients in §. We use the \phi(G) above and have
A = \phi(G)A[\phi(G')]^{-1}, GA = [\phi(G)]^2A = \phi(G)A\phi(G').

Write H = Q'[\phi(G)]^{-1} and obtain
HAH' = Q'[\phi(G)]^{-1}A[\phi(G')]^{-1}Q = Q'A(G')^{-1}Q = Q'AP' = PAQ.

We have proved the following theorem:

THEOREM 25. Let \epsilon = \pm 1, A' = \epsilon A, P and Q be non-singular matrices such
that (PAQ)' = \epsilon(PAQ). Define
(76) G = P^{-1}Q', H = Q'[\phi(G)]^{-1},
where \phi(G) is a polynomial in G determined so that its square is G. Then
HAH' = PAQ.

It is clear that the proof we have given of Theorem 25 is not valid for 
fields of characteristic two. The result itself does not hold. We shall prove 
this in an example given later. 


DEFINITION. Let A, B, C, D be n-rowed square matrices with elements in §.
Then we say that the pairs (A, B) and (C, D) are equivalent in § if there exist
non-singular matrices P and Q with elements in § such that


PAQ = C, PBQ = D.
We also call (A, B) and (C, D) congruent in § if there exists a non-singular
matrix H with elements in § such that
HAH' = C, HBH' = D.


The notion of congruence of pairs may be applied to either symmetric 
or alternate matrices. It is clear that if A is symmetric, C must be symmetric, 
and when A is alternate C must be alternate. We have similar necessary con- 
ditions on B and D. Then Theorem 25 implies the following theorem: 


THEOREM 26. Let § be an algebraically closed field of characteristic not two. 
Then two pairs of alternate or symmetric matrices satisfying the trivial necessary 
conditions above are congruent if and only if they are equivalent. 


Conditions that two pairs of matrices be equivalent are expressed in the 
literature in terms of the invariant factors of the matrices Ax+B, Cx+D,
where x is an indeterminate over §. This theory holds for an arbitrary §, and 
we shall not discuss these known results. 

4. Elementary applications. There are two simple consequences of the theory
of pairs of matrices which seem interesting and appear never to have been
noted in the literature. We first let A, B be two non-singular matrices with 
elements in an algebraically closed field § of characteristic not two, and ask 
the question as to when they are congruent. The answer is given as the corol- 
lary: 

COROLLARY I. Write A = A_1 + A_2, B = B_1 + B_2 where A_1 = ½(A + A') and
B_1 = ½(B + B') are symmetric, while A_2 = ½(A - A') \neq 0 and B_2 = ½(B - B')
\neq 0 are alternate matrices. Then A and B are congruent if and only if


Ax + A_1, Bx + B_1


have the same invariant factors.


For the theory of invariant factors states that (A, A_1) and (B, B_1) are
equivalent if and only if Ax + A_1 and Bx + B_1 have the same invariant factors.
Then HAH' = B implies that HA'H' = B', H(A + A')H' = B + B', HA_1H' = B_1,
so that (A, A_1) and (B, B_1) are equivalent when A and B are congruent.
Conversely let (A, A_1) and (B, B_1) be equivalent so that


PAQ = B, PA_1Q = B_1.
Then P(A - A_1)Q = B - B_1 = B_2 = PA_2Q and the pairs (A_1, A_2) and (B_1, B_2) are
equivalent. By Theorem 26 they are congruent, HA_1H' = B_1, HA_2H' = B_2, so
that HAH' = B_1 + B_2 = B as desired.
We next restrict our attention to the field \mathfrak{C} of complex numbers.
Corollary I then states that two Hermitian non-singular matrices


(77) A = A_1 + A_2i, B = B_1 + B_2i


for real A_1, A_2, B_1, B_2 are congruent in \mathfrak{C} if and only if the matrices Ax + A_1,
Bx + B_1 have the same invariant factors. An analogous question arises when


we consider two symmetric matrices (77). Then A_1, A_2, B_1, B_2 are all real
symmetric matrices. We then ask for necessary and sufficient conditions that
PA\bar{P}' = B for a non-singular P with complex elements. It is clear that then
P\bar{A}'\bar{P}' = \bar{B}', P(A + \bar{A}')\bar{P}' = B + \bar{B}', so that

PA_1\bar{P}' = B_1, PA_2\bar{P}' = B_2.
The classical analogue of Theorem 26 states that two pairs of Hermitian
complex matrices (A_1, A_2) and (B_1, B_2) are conjunctive if and only if they are
equivalent. They are clearly equivalent if and only if (A, A_1) and (B, B_1) are
equivalent and we have the statement:


COROLLARY II. Two non-singular complex symmetric matrices A = A_1 + A_2i,
B = B_1 + B_2i for real symmetric A_1, A_2, B_1, B_2 are conjunctive if and only if


Ax + A_1, Bx + B_1


have the same invariant factors.

5. Orthogonal equivalence. The theory of the orthogonal equivalence of 
two symmetric matrices in an algebraically closed field § of characteristic not 
two is equivalent to the theory of the congruence of pairs (A, B), (C, D) of 
which A and C are non-singular. The concept of orthogonal equivalence is 
defined as follows. Let D be a square matrix with elements in a field §. 
Then we call D orthogonal if DD' = I is the identity matrix. Clearly then
DD' = D'D. We consider two symmetric or alternate matrices A, B and call
them orthogonally equivalent if DAD' = B for an orthogonal D. Since
B = DAD^{-1} is similar to A when A and B are orthogonally equivalent, the
condition of similarity is a necessary condition. However we may actually 
prove the following theorem: 

THEOREM 27. Let A and B be both symmetric or both alternate matrices with 
elements in an algebraically closed field § whose characteristic is not two. Then 
A and B are orthogonally equivalent if and only if they are similar. 

For let PAP^{-1} = B. Then PIP^{-1} = I and the pairs (I, A) and (I, B) are
equivalent. By Theorem 26 they are congruent, DID' = I, DAD' = B, D is
orthogonal. The converse is trivial.

Theorem 27 is an immediate consequence of Theorem 26. However the
connection between the two results is even closer than is indicated by this
fact. For assume that the result of Theorem 27 has been proved true
independently of Theorem 26. Let now (A, B), (C, G) be any two pairs of matrices
such that


A' = A, B' = \epsilon B, C' = C, G' = \epsilon G,


where \epsilon = \pm 1. Assume also that A and C are non-singular. Then (A, B) and
(C, G) are congruent if and only if Ax + B and Cx + G have the same invariant
factors. This statement of Theorem 26 follows from Theorem 27. We use the
existence of matrices P and Q such that PAP' = I, QCQ' = I. Then the invariant
factors of Ax + B and xI + PBP' are the same. Similarly those of Cx + G and
xI + QGQ' are the same. But when Ax + B and Cx + G have the same invariant
factors, so do xI + QGQ', xI + PBP', and QGQ' is orthogonally equivalent to PBP'. We let
QGQ' = DPBP'D', DD' = I
and have
HAH' = Q^{-1}DPAP'D'(Q')^{-1} = Q^{-1}(Q')^{-1} = C,
H = Q^{-1}DP, G = HBH'.


The above proof cannot, of course, be carried out for fields § which are 
not algebraically closed. For it depends essentially upon the property that 
QAQ’ =I for every non-singular A with elements in §. However it is clear 
that in an arbitrary § a criterion for the congruence of two pairs always gives 
a criterion for orthogonal equivalence. For we may take one matrix in each 
pair to be the identity matrix. 

6. J-orthogonal equivalence. The set M_n of all n-rowed square matrices
with elements in a field § is said to possess an involution J over § if there is a
one-to-one correspondence


A \leftrightarrow A^J (A, A^J in M_n)
such that
(78) (A^J)^J = A, (A + B)^J = A^J + B^J, (AB)^J = B^JA^J, (aI_n)^J = aI_n


for every A and B of M_n and a of §. Here I_n is the n-rowed identity matrix.
I have proved* that every J is determined† by


(79) A^J = E^{-1}A'E,


where E is a non-singular matrix E = \pm E'. Call A J-symmetric if A = A^J,
J-alternate if A = -A^J and § does not have characteristic two.

* See the author's paper, Involutorial simple algebras and real Riemann matrices, loc. cit.
† Note that conversely J determines E only up to a scalar factor.

Let S be an automorphism of M_n over §. Then there exists a non-singular
matrix P such that


(80) A^S = P^{-1}AP
for every A. The involution
(81) A \rightarrow A^{S^{-1}JS}


has been called an involution cogredient with J.* In fact


A^{S^{-1}} = PAP^{-1}, (PAP^{-1})' = (P')^{-1}A'P', (A^{S^{-1}})^J = E^{-1}(P')^{-1}A'P'E,


(82) A^{S^{-1}JS} = P^{-1}E^{-1}(P')^{-1}A'P'EP = E_0^{-1}A'E_0,


where E_0 = P'EP is congruent to E. The set M_n may be thought of as the
set of linear transformations of a vector space, and the replacement of any
basis of the space by any other replaces the matrices of M_n by the similar matrices
P^{-1}AP. They then replace E by P'EP, J by S^{-1}JS. Hence cogredient
involutions are merely different representations of the same abstract involution.

* N. Jacobson, A class of normal simple Lie algebras of characteristic zero, Annals of Mathematics,
vol. 38 (1937), pp. 508-517. Note that conversely the property in the preceding footnote implies
that E_0 defines an involution cogredient with that defined by E if and only if E_0 is congruent to a
scalar multiple of E.

A matrix D is said to be J-orthogonal if


(83) D^JD = I_n.
Two matrices A and B are called J-orthogonally equivalent if

(84) B = D^JAD

for a J-orthogonal matrix D. The case where A^J is the transpose of A, that
is E = I_n, has already been considered in Theorem 27. But we have the
following generalization:


THEOREM 28. Let § be an algebraically closed field of characteristic not two,
J be an involution of M_n over §, A and B be matrices with elements in § such
that A^J = \epsilon A, B^J = \epsilon B, \epsilon = \pm 1. Then A and B are J-orthogonally equivalent if
and only if they are similar.


For if A and B are J-orthogonally equivalent they are clearly similar.
Conversely let PAP^{-1} = B. Now E^{-1}B'E = \epsilon B, E^{-1}A'E = \epsilon A so that
B_0 = EB, A_0 = EA, B_0' = \delta B_0, A_0' = \delta A_0,


where \delta = \pm 1 is the product of \epsilon and E'E^{-1} = \pm 1. Then the pairs (E, A_0)
and (E, B_0) have the property (EPE^{-1})E(P^{-1}) = E, (EPE^{-1})A_0(P^{-1}) = B_0.
By Theorem 26 there exists a non-singular matrix D such that


D'ED = E, D'A_0D = B_0.
But then
D^JD = (E^{-1}D'E)D = E^{-1}E = I
and D is J-orthogonal. Also B = E^{-1}D'EAD = D^JAD is J-orthogonally equivalent to A.


The proof above indicates that the theory of the congruence of pairs of
matrices (E, A), (C, B) over any field § is equivalent to the theory of
J-orthogonal equivalence. We are of course assuming that E and C are non-singular,


E' = \epsilon E, C' = \epsilon C, A' = \delta A, B' = \delta B (\epsilon = \pm 1, \delta = \pm 1).


Necessarily C must be congruent to E, and there is no loss of generality if we
replace (C, B) by (E, B_1), where B_1 is clearly congruent to B. Then (E, A)
and (E, B_1) are congruent if and only if E^{-1}A and E^{-1}B_1 are J-orthogonally
equivalent. In fact we define G^J = E^{-1}G'E for any matrix G and have


A_0 = E^{-1}A, B_0 = E^{-1}B_1.
We then obtain
A_0^J = E^{-1}A'(E')^{-1}E = \epsilon\delta A_0, B_0^J = \epsilon\delta B_0.


The proof of our theorem then shows that the J-orthogonal equivalence of
A_0, B_0 and the congruence of (E, A) and (E, B_1) are equivalent concepts.
Notice that this is true for arbitrary fields §, and that we are not assuming the
algebraic closure of § or even that the characteristic of § is not two.


IV. SIMILARITY OF SQUARE MATRICES 


1. Reduction to primary components. If two matrices A and B are J-or- 
thogonally equivalent, they are similar. It follows that the properties of J-or- 
thogonal equivalence depend essentially upon the theory of the similarity of 
square matrices. The usual modern formulation of this theory is valid for an 
arbitrary field* but is in a form unsuited for the application we shall wish to 
make. Thus we shall give a new formulation. 

Our first assumptions from the classical theory are the following lemmas: 


Lemma 9. Let B be obtained from A by a row permutation followed by the 
same column permutation. Then B and A are similar. 


Lemma 10. Let f(x) = g(x) \cdot h(x) be the characteristic function of an n-rowed
square matrix A where g(x) is prime to h(x) and has leading coefficient unity.
Then A is similar to


(85) \begin{pmatrix} G & 0 \\ 0 & H \end{pmatrix},


where G, H have respective characteristic functions g(x), h(x).


* For an exposition of the validity of this fact and proofs of the results assumed in this chapter
see the author's Modern Higher Algebra, chap. 4.


We are assuming that G, H, A have elements in a field § containing the
coefficients of f(x), g(x), h(x), and that our similarity is similarity in §. The
importance of the form (85) is principally due to the fact that for any G and H
we have


(86) \phi(A) = \begin{pmatrix} \phi(G) & 0 \\ 0 & \phi(H) \end{pmatrix}.


This means more precisely that if
(87) \phi(x) = x^r + a_1x^{r-1} + \cdots + a_r (a_i in §),
then G has m rows, H has n - m rows, and


(88) \phi(G) = G^r + a_1G^{r-1} + \cdots + a_rI_m, \phi(H) = H^r + a_1H^{r-1} + \cdots + a_rI_{n-m}.


In particular if \phi(x) is the minimum function of G and a_r \neq 0, then


But then x\phi(x) is the minimum function of the matrix G_0 formed by bordering
G by n - m rows and columns of zeros.

We shall call two square matrices relatively prime if their characteristic 
functions are relatively prime. We also say that a square matrix is primary if 
its characteristic function is a power of an irreducible polynomial. Then 
Lemma 10 gives the following lemma: 

Lemma 11. Every square matrix A is similar in § to a direct sum of relatively
prime primary components the product of whose characteristic functions is that
of A.

Let g(x) and h(x) be relatively prime. Then a(x)g(x) + b(x)h(x) = 1 for
polynomials a and b. Define \delta(x) = a(x)g(x), \gamma(x) = 1 - \delta(x). Then \gamma(G_0) = I_m,
\delta(G_0) = 0, \gamma(H_0) = 0, and \delta(H_0) = I_{n-m}, for any m-rowed square matrix G_0 such
that g(G_0) = 0, and any (n-m)-rowed H_0 such that h(H_0) = 0. We use this
result in the proof of the following lemma:


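Lemma 12. Let P be non-singular and


(90) A_0 = P \begin{pmatrix} G & 0 \\ 0 & H \end{pmatrix} P^{-1} = \begin{pmatrix} G_0 & 0 \\ 0 & H_0 \end{pmatrix},


where the characteristic function g(x) of G and G_0 is prime to the characteristic
function h(x) of H and H_0. Then


(91) P = \begin{pmatrix} P_1 & 0 \\ 0 & P_2 \end{pmatrix}, P_1GP_1^{-1} = G_0, P_2HP_2^{-1} = H_0.


Write

P = \begin{pmatrix} P_1 & P_3 \\ P_4 & P_2 \end{pmatrix}, A = \begin{pmatrix} G & 0 \\ 0 & H \end{pmatrix},

so that A_0P = PA. It is clearly sufficient to prove P_3 and P_4 zero. Form \gamma(A_0) and obtain


\gamma(A_0) = \begin{pmatrix} I_m & 0 \\ 0 & 0 \end{pmatrix} = P\gamma(A)P^{-1} = P \begin{pmatrix} I_m & 0 \\ 0 & 0 \end{pmatrix} P^{-1}, \begin{pmatrix} P_1 & P_3 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} P_1 & 0 \\ P_4 & 0 \end{pmatrix},

and P_3 = 0, P_4 = 0 as desired.
An evident induction now gives the following lemma:


Lemma 13. Let
(92) A = diag [G_1, \cdots, G_t], A_0 = diag [G_{01}, \cdots, G_{0t}] = PAP^{-1},
with relatively prime primary components G_i having the same characteristic
functions as G_{0i}. Then P is the direct sum of matrices P_i such that G_{0i} = P_iG_iP_i^{-1}.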


2. Indecomposable matrices. A matrix A is called indecomposable in § if
A is not similar in § to a direct sum of two matrices. We shall assume the
known lemma:*


Lemma 14. A matrix A is indecomposable in § if and only if its
characteristic function is equal to the power


(93) f(x) = x^n - (a_1x^{n-1} + \cdots + a_n) = [d(x)]^e (a_i in §)


of an irreducible polynomial d(x) and coincides with its minimum function.
Every indecomposable matrix is similar in § to


(94) \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ a_n & a_{n-1} & a_{n-2} & \cdots & a_1 \end{pmatrix}.


Conversely the minimum function of (94) is its characteristic function f(x) and
(94) is indecomposable if and only if f(x) = d(x)^e for an irreducible d(x).
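
The statement that the matrix (94) has f(x) as both characteristic and minimum function can be illustrated with integer arithmetic. The sketch below assumes the sign convention f(x) = x^n - a_1 x^(n-1) - ... - a_n with last row (a_n, ..., a_1); the sample polynomial and all names are chosen only for illustration.

```python
def companion(a):
    """Matrix of the form (94) for a = [a_1, ..., a_n] (assumed convention)."""
    n = len(a)
    C = [[1 if j == i + 1 else 0 for j in range(n)] for i in range(n - 1)]
    C.append(list(reversed(a)))        # last row: a_n, a_{n-1}, ..., a_1
    return C

def matmul(X, Y):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Y)]
            for row in X]

def powers_of(C):
    """Return [I, C, C^2, ..., C^n]."""
    n = len(C)
    result = [[[int(i == j) for j in range(n)] for i in range(n)]]
    for _ in range(n):
        result.append(matmul(result[-1], C))
    return result

def f_at(C, a):
    """Evaluate f(C) = C^n - a_1*C^(n-1) - ... - a_n*I."""
    n = len(a)
    pw = powers_of(C)
    F = [row[:] for row in pw[n]]
    for k, ak in enumerate(a, start=1):
        for i in range(n):
            for j in range(n):
                F[i][j] -= ak * pw[n - k][i][j]
    return F

a = [6, -11, 6]                # f(x) = x^3 - 6x^2 + 11x - 6
C = companion(a)
F = f_at(C, a)
# The first rows of I, C, C^2 are the unit vectors, so I, C, C^2 are linearly
# independent and no polynomial of degree below 3 can annihilate C.
first_rows = [pw[0] for pw in powers_of(C)[:3]]
```

Since f(C) vanishes and no polynomial of lower degree does, f is the minimum function of C in this example.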


* See the author's Modern Higher Algebra, chap. 4.


Lemmas 11, 13 reduce the problem of the reduction of a matrix to a direct
sum of indecomposables under similarity transformations to the case where
A is primary. When A is primary and has the characteristic function d(x)^f,
d(x) irreducible, the invariant factors of xI - A are the characteristic
functions of its indecomposable components. Then A is similar in § to


(95) diag [B_1, \cdots, B_t],


where B_i has d(x)^{e_i} as characteristic function and is indecomposable,
(96) e_1 + \cdots + e_t = f.

We shall call the e_i the indices of (95). In particular

(97) e_1 = e,


where d(x)^e is the minimum function of both (95) and B_1.
We now prove the following lemma:
We now prove the following lemma: 


Lemma 15. Let d(x) = c[h(x)] be irreducible in §, A be an n-rowed
indecomposable matrix with c(x)^e as minimum function,


(98) h(x) = x^m + b_1x^{m-1} + \cdots + b_m.
Then the matrix


(99) B = \begin{pmatrix} 0 & I_n & 0 & \cdots & 0 \\ 0 & 0 & I_n & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & I_n \\ A - b_mI_n & -b_{m-1}I_n & -b_{m-2}I_n & \cdots & -b_1I_n \end{pmatrix}


is indecomposable and has d(x)^e as characteristic function.


For it is clear, from the fact that all elements of B are polynomials in A,
that


h(B) = B^m + b_1B^{m-1} + \cdots + b_mI = diag [A, \cdots, A],


I the nm-rowed identity matrix. Then [c(A)]^e = 0 so that [d(B)]^e = c[h(B)]^e
= 0. It follows that the minimum function of B divides [d(x)]^e. But d(x) is
irreducible; thus it has the form d(x)^g, g \leq e. However c[h(B)]^g = 0 implies
that [c(A)]^g = 0 whence g \geq e. Then g = e. The degree of the minimum
function d(x)^e of B is the order nm of B, and B is indecomposable.

We next let B be a decomposable primary matrix so that we may assume
that B is a direct sum of matrices B_i which are indecomposable. If d(x)^f is
the characteristic function of B, and the irreducible polynomial d(x) = c[h(x)]
as in Lemma 15, we may assume that each B_i has the form (99). But the b_i
are the same in each B_i, and an evident permutation of the rows and
corresponding columns of B carries B into a similar matrix of the form (99), where


A = diag [A_1, \cdots, A_t]


has the same indices as B. We state this result as in the following lemma:


Lemma 16. Let d(x) = c[h(x)] be irreducible, h(x) be given by (98), and B have
characteristic function [d(x)]^f. Then B is similar in § to a matrix (99), where A
has the characteristic function c(x)^f and the same indices as B.


3. Two canonical forms. A matrix N is called nilpotent of index e if N^e = 0,
N^{e-1} \neq 0. The minimum function of N is clearly x^e. Then N is similar in § to


(100) diag [N_1, \cdots, N_t],


where N_i is nilpotent of index e_i and may be taken to be the e_i-rowed square
matrix


\begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & \cdots & 0 \end{pmatrix}.


Notice that N_i = 0 if e_i = 1. Also e = e_1, N has order f = e_1 + \cdots + e_t. The
matrices N_i are indecomposable, and thus a nilpotent matrix is indecomposable
if and only if its index is its order. But the indices of N are clearly the
respective indices of its nilpotent indecomposable components N_i in the sense
in which we defined index above.

We have seen that the terminology of indices which we defined for
arbitrary matrices has precise connotations for nilpotent matrices. These
connotations are made precise also generally by the following theorem:


THEOREM 29. Let the characteristic function of B be d(x)^f, where d(x)
= x^m + a_1x^{m-1} + \cdots + a_m, (a_i in §), is irreducible. Then B is similar in § to


(101) \begin{pmatrix} 0 & I & 0 & \cdots & 0 \\ 0 & 0 & I & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & I \\ N - a_mI & -a_{m-1}I & -a_{m-2}I & \cdots & -a_1I \end{pmatrix},


where I is the f-rowed identity matrix and N is a nilpotent matrix whose indices
are the same as those of B.


Our theorem is an immediate application of Lemma 16 to the case where
h(x) is irreducible, c(x) = x, d(x) = c[h(x)] = h(x). Then the matrix A of
Lemma 16 has x^f as characteristic function and is our nilpotent matrix N.

The matrix (101) is a canonical form of a primary matrix. For some pur- 
poses other forms may be preferable. One such form is given in the following: 


THEOREM 30. Let d(x) = c(x^{p^k}) be irreducible,


(102) B_j = \begin{pmatrix} 0 & I & 0 & \cdots & 0 \\ 0 & 0 & I & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & I \\ B_{j-1} & 0 & 0 & \cdots & 0 \end{pmatrix} (j = 1, \cdots, k),


where B_0 = A is an m-rowed square matrix with [c(x)]^f as characteristic function.
Then B = B_k has [d(x)]^f as characteristic function and the same indices as A.

For proof we take h(x) = x^p in Lemma 16 and use an induction on k. This
result is of particular use in case d(x) is an inseparable irreducible polynomial
over § of characteristic p and is equal to c(x^{p^k}), c(x) separable.

4. The algebra §[B]. The algebra §[B] of all polynomials in a matrix B
with characteristic function d(x)^f, d(x) irreducible, contains certain sub-fields
and certain nilpotent matrices. We first prove the following theorem:


THEOREM 31. Let the minimum function of B have the form d(x)^e with d(x)
irreducible. Write


(103) d(x) = c(x^\kappa),


where c(x) is separable and


\kappa = p^k, \kappa = 1,


according as d(x) is inseparable over § of characteristic p or is separable. Then
there exists a nilpotent matrix N of index e such that


(104) B^\kappa = A + N, c(A) = 0.


Thus \mathfrak{K} = §[A] is a separable field and the polynomial x^\kappa - A is irreducible
in \mathfrak{K}.


Our proof depends upon the known lemma:*


Lemma 17. Let c(x) be irreducible and separable. Then for every e there exists
a polynomial g_e(x) such that c[g_e(x)] is divisible by c(x)^e.

* Loc. cit. chap. 10. The result trivially follows from our Lemma 8.

The matrix T = B^\kappa has c(x)^e as minimum function and Lemma 17 implies
the existence of a polynomial g(T) = A_0 such that c(A_0) = 0. Since c(x) is
separable and irreducible the algebra \mathfrak{K} = §[A_0] is a field. We now prove the
following lemma:


Lemma 18. The matrix c(T) = N_0 is nilpotent of index e and the algebra
\mathfrak{K}[N_0] = §[T].

For the minimum function of N_0 is clearly x^e. It remains to prove that
\mathfrak{K}[N_0] which is contained in §[T] has the same order over § and equals
§[T]. It is sufficient to show that a_0 + a_1N_0 + \cdots + a_{e-1}N_0^{e-1} = 0 for a_i in
\mathfrak{K} if and only if the a_i = 0. But then a_0N_0^{e-1} = 0, a_0 = 0 in \mathfrak{K}. Similarly all
the other a_i = 0.

The quantity T now has the form T = A + N, where A is in \mathfrak{K},
N = (a_1 + a_2N_0 + \cdots + a_{e-1}N_0^{e-2})N_0 is nilpotent. But c(T) = c(A) + Nq(N)
= N_0, c(A) is nilpotent and in \mathfrak{K}. It follows that c(A) = 0, N_0 = Nq(N),
N_0^g = 0 where g is the index of N. Thus g \geq e. Similarly g \leq e and N has index e.
We next prove the following lemma:


Lemma 19. Let x^{p^k} - a be reducible in a field \mathfrak{K} containing a. Then a = b^p,
b in \mathfrak{K}.


For x^{p^k} - a = g(x)h(x), where the constant term of g(x) is the tth power of
a root \xi of x^{p^k} - a = 0, g(x) has degree t. Hence \xi^{sr} = b_0 in \mathfrak{K}, where s is prime to
p, r is a power of p. But then ss_1 \equiv 1 (p^k) and \xi^r = b_1 in \mathfrak{K}. Since r < p^k we have

p^k = rp^j, b_1^{p^j} = a = b^p, b in \mathfrak{K}.

We may finally show that x^\kappa - A is irreducible. For otherwise A = D^p,
D in \mathfrak{K}, c(A) = c(D^p) = 0, c(x^p) is inseparable and has a root D in the separable
field \mathfrak{K}. This is impossible, and our proof of Theorem 31 is complete.

The matrix A has c(x) as minimum function and is similar to


(105) $\mathrm{diag}\,[G, \cdots, G]$,


where G has c(x) as both minimum function and characteristic function. It is
well known that the only matrices commutative with G are polynomials in G
with coefficients in $\mathfrak{F}$. If AB = BA and we write $B = (B_{ij})$, we obtain
$B_{ij}G = GB_{ij}$, the $B_{ij}$ are in $\mathfrak{F}[G]$. Thus B may be regarded as a matrix with
elements in the field $\mathfrak{K} = \mathfrak{F}[G]$.

The matrix N of Theorem 31 is commutative with A and hence may be
regarded as a nilpotent matrix of index e and elements in $\mathfrak{K}$. The matrix B
is also a matrix over $\mathfrak{K}$ but now has minimum function


(106) $(x^{\pi} - G)^e$.


Then the construction of matrices with minimum function $[c(x^{\pi})]^e$ is completed
by Theorems 30, 31, and the above.




5. Fields of matrices. Let c(x) define a separable field $\mathfrak{K} = \mathfrak{F}[G]$ where G
is a matrix in our canonical form given by (94) for the separable irreducible
polynomial c(x) used as the f(x) of (93). Write


$c(x) = (x - \alpha_1) \cdots (x - \alpha_n)$

with distinct $\alpha_i$ in an algebraic extension of $\mathfrak{F}$. Then the Vandermonde matrix*

(107) $V = (a_{ij})$, $a_{ij} = \alpha_j^{\,i-1}$ $(i, j = 1, \cdots, n)$

is non-singular,

(108) $VV' = (s_{i+j-2})$


has elements $s_i = \sum_{j=1}^{n} \alpha_j^{\,i}$ in $\mathfrak{F}$ and determinant the discriminant of f(x). A
simple computation gives


(109) $V^{-1}GV = \alpha = \mathrm{diag}\,[\alpha_1, \cdots, \alpha_n]$.

Let H be any n-rowed square matrix with elements in $\mathfrak{F}$, and define


$H_0 = (VV')^{-1}H = (h_{ij})$ ($h_{ij}$ in $\mathfrak{F}$).


Then

(110) $V^{-1}HV = V'H_0V = (h(\alpha_i, \alpha_j))$ $(i, j = 1, \cdots, n)$,


where

$h(x, y) = \sum_{i,j=1}^{n} h_{ij}\, x^{i-1} y^{j-1}$.

Clearly HG = GH if and only if $h(\alpha_i, \alpha_j) = 0$ for $i \neq j$, and $H = h(G)$, where
$h(x) = h(x, x)$.

The result that HG =GH if and only if H is a polynomial in G is true for 
other types of matrices G as well as those above. In particular it is true for 
indecomposable nilpotent matrices. Let now A be an n-rowed square matrix 
with irreducible minimum function d(x) of degree m. This is the case e=1 of 
our theory, and we do not assume that d(x) is separable. If d(x) =x, then 
A =0, and we discard this case. Otherwise A is similar in § to the matrix 


diag [G,--- ,G], 
where G is given by (94) for d(x), and A is now a q-rowed matrix with elements in
a field $\mathfrak{F}[G]$. The only matrices commutative with A are q-rowed matrices


* The method of proof of this section has been used many times in the theories of Riemann 
matrices and of linear transformations. For a partial list of references see the bibliography in the 
paper referred to in the first footnote on page 387. 




with elements in $\mathfrak{F}[G]$, and the minimum function of A over $\mathfrak{F}[G]$ is $x - G$.
However the analogous result cannot be obtained when e > 1 since we cannot
in general prove the existence of an inseparable sub-field $\mathfrak{K}$ of our algebra of
polynomials in a matrix.

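6. Consequences of the canonical forms. The transpose A' of any matrix
A is similar to A. For certain simple matrices the transformation carrying A
into A' assumes an interesting and simple form. We let A have the canonical
form

(111) $A = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ a & 0 & 0 & \cdots & 0 \end{pmatrix}$,


so that the characteristic and minimum functions of A are $f(x) = x^n - a$. Write

(112) $U = \begin{pmatrix} 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & \cdots & 1 & 0 \\ \vdots & & & & \vdots \\ 0 & 1 & \cdots & 0 & 0 \\ 1 & 0 & \cdots & 0 & 0 \end{pmatrix}$.

Then

(113) $UA = \begin{pmatrix} a & 0 & \cdots & 0 & 0 \\ 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & \cdots & 1 & 0 \\ \vdots & & & & \vdots \\ 0 & 1 & \cdots & 0 & 0 \end{pmatrix}$, $UA^2 = \begin{pmatrix} 0 & a & 0 & \cdots & 0 & 0 \\ a & 0 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & 0 & \cdots & 1 & 0 \\ \vdots & & & & & \vdots \\ 0 & 0 & 1 & \cdots & 0 & 0 \end{pmatrix}$.

By an evident computation if

(114) $f(A) = a_0 + a_1 A + \cdots + a_{n-1} A^{n-1}$,

then

(115) $U f(A) = \begin{pmatrix} a a_1 & a a_2 & \cdots & a a_{n-1} & a_0 \\ a a_2 & a a_3 & \cdots & a_0 & a_1 \\ \vdots & & & & \vdots \\ a a_{n-1} & a_0 & \cdots & a_{n-3} & a_{n-2} \\ a_0 & a_1 & \cdots & a_{n-2} & a_{n-1} \end{pmatrix}$.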




It is clear that UA is symmetric, and since U is symmetric and non-singular,

(116) $UAU^{-1} = A'$.
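
The relation (116) can be checked numerically. The following is a minimal sketch in plain integer arithmetic, assuming A has ones on the superdiagonal and a in the lower-left corner as in (111), and U has ones on the anti-diagonal as in (112). The helper names (`canonical_A`, `reversal_U`, `check`) are illustrative and not taken from the paper.

```python
# Illustrative check of (116); all helper names are hypothetical.

def matmul(X, Y):
    """Product of two square integer matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(X):
    return [list(row) for row in zip(*X)]

def canonical_A(n, a):
    """Matrix (111): ones on the superdiagonal, a in the lower-left corner."""
    A = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        A[i][i + 1] = 1
    A[n - 1][0] = a
    return A

def reversal_U(n):
    """Matrix (112): ones on the anti-diagonal, zeros elsewhere."""
    return [[1 if i + j == n - 1 else 0 for j in range(n)] for i in range(n)]

def check(n, a):
    """True when UA is symmetric and U A U^{-1} equals the transpose of A."""
    A, U = canonical_A(n, a), reversal_U(n)
    UA = matmul(U, A)
    symmetric = UA == transpose(UA)
    # U is its own inverse, so U A U^{-1} is the same as U A U.
    conjugate_ok = matmul(UA, U) == transpose(A)
    return symmetric and conjugate_ok
```

Since the computation uses only integer identities, the same check goes through over any field, including one of characteristic two.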


We shall require (115) later. We shall also use the following existence theorem 
which is a consequence of Theorem 30 for the case p=2: 


THEOREM 32. Let $n = 2^k$, $f(x) = x^n - a$ be irreducible in $\mathfrak{F}$. Then there exists
an n-rowed square matrix A with f(x) as minimum function and such that


(117) $E^{-1}A'E = -A$, $S^{-1}A'S = A$

for an alternate matrix E and a symmetric matrix S.


The theorem is true for m=2, k=1 since 


0-1 0 1 0 1 
satisfy (117) by direct computation. Assume the theorem true for k—1, write 
m =2*-!, and have Sm =Am. By Theorem 30 the matrix 


118 4= 


has f(x) as minimum (and characteristic) function. Then the matrices 


0 Fun 0 Sa 
0 Sm O 


are alternate and symmetric respectively, and if «= +1, 


O Sm In ‘SmAm 
0 An Ss. ESmAm O 
since A},Sm=SmAm. Put «= —1 and obtain EA=—A’E. The value e=1 
gives SA =A’S as desired. 


(120) 


V. ORTHOGONAL EQUIVALENCE IN § OF CHARACTERISTIC TWO 


1. The problem. Our principal interest will be in obtaining a complete 
determination of the invariant factors of any symmetric matrix whose ele- 
ments are in a field of characteristic two.* Part of this theory will be con- 
cerned with the orthogonal equivalence of symmetric matrices, and we shall 


* The field of reference throughout this chapter will be any field of characteristic two and we 
shall drop the notation 9} and simply use ¥. 




show why it can happen that two symmetric matrices may be similar but not 
orthogonally equivalent. 

2. J-orthogonal equivalence.* Let J be an involution over $\mathfrak{F}$ of the algebra
$\mathfrak{M}_n$ of all n-rowed square matrices with elements in a field $\mathfrak{F}$ (with any
characteristic). Suppose that


(121) $A^J = \epsilon A$, $B = PAP^{-1}$

for a non-singular matrix P. We may then prove the lemma:

LEMMA 20. The matrix B of (121) has the property

(122) $B^J = \epsilon B$

if and only if $P^JP$ is commutative with A.

For $B^J = (P^J)^{-1}\epsilon AP^J = \epsilon B = \epsilon PAP^{-1}$, $P^JPA = AP^JP$. The converse is
trivial.
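
Lemma 20 can be illustrated by exhaustive search in the smallest interesting case. The sketch below assumes the field with two elements, takes J to be ordinary transposition (so that $\epsilon = 1$), and works with 3-rowed matrices; these choices and all function names are illustrative assumptions, not part of the paper.

```python
from itertools import product

N = 3  # matrix size, an assumed value small enough for exhaustive search

def matmul(X, Y):
    """Product of two N-rowed matrices over the field with two elements."""
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(N)) % 2
                       for j in range(N)) for i in range(N))

def transpose(X):
    return tuple(zip(*X))

IDENTITY = tuple(tuple(int(i == j) for j in range(N)) for i in range(N))

def all_matrices():
    """Every N-rowed matrix with entries 0 or 1."""
    for bits in product((0, 1), repeat=N * N):
        yield tuple(tuple(bits[i * N:(i + 1) * N]) for i in range(N))

def inverse(P):
    """Inverse of P found among its powers, or None when P is singular."""
    prev, cur, seen = IDENTITY, P, set()
    while cur not in seen:
        if cur == IDENTITY:
            return prev  # prev * P == identity
        seen.add(cur)
        prev, cur = cur, matmul(cur, P)
    return None

def lemma20_holds():
    """Check: P A P^{-1} is symmetric exactly when P'P commutes with A."""
    symmetric = [A for A in all_matrices() if A == transpose(A)]
    pairs = [(P, inverse(P)) for P in all_matrices()]
    pairs = [(P, Pinv) for P, Pinv in pairs if Pinv is not None]
    for P, Pinv in pairs:
        PtP = matmul(transpose(P), P)
        for A in symmetric:
            B = matmul(matmul(P, A), Pinv)
            b_symmetric = B == transpose(B)
            commutes = matmul(PtP, A) == matmul(A, PtP)
            if b_symmetric != commutes:
                return False
    return True
```

The search covers all 168 non-singular P and all 64 symmetric A, so it is a finite verification of the lemma in this one case only, not a proof.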

If B is any matrix similar to A so that $B = PAP^{-1}$, then $B = DAD^{-1}$ if and
only if D = PC where C is commutative with A. Then


(123) $D^JD = C^J(P^JP)C$.

But D is J-orthogonal if and only if $D^JD$ is the n-rowed identity matrix.

LEMMA 21. The matrix $PAP^{-1}$ is J-orthogonally equivalent to A if and only
if $P^JP$ is congruent to the identity matrix under a transformation (123) with C
commutative with A.


Let us study the implications of Lemma 21 in a special case. Assume that
A is such that the only matrices commutative with A are polynomials in A
with coefficients in $\mathfrak{F}$. Then $P^JP = \varphi(A)$ is such a polynomial and C = C(A).
If $A^J = A$, then $C^J = C$, and $C^J\varphi C = I$ if and only if


(124) $\varphi = C^{-2}$


is the square of a polynomial in A. Conversely if $\varphi = Q^JQ = Q^2$, we have
$P^JP = Q^JQ$, $D = PQ^{-1}$ is J-orthogonal, and


(125) $B = PAP^{-1} = DAD^{-1}$


is J-orthogonally equivalent to A.

We shall see in an example later that it is possible for a given $\varphi(A)$ to have
the form $P^JP$ but not the form $[Q(A)]^2$. Thus the restriction above is not a
redundant consequence of the equation $\varphi(A) = P^JP$.

* For recent results implying theorems on the J-orthogonal equivalence of matrices over an arbi- 
trary field of characteristic not two, see the papers of John Williamson in the American Journal of 


Mathematics, vol. 57 (1935), pp. 475-490; vol. 58 (1936), pp. 141-163; and vol. 59 (1937), pp. 399- 
413. 




In the remainder of the chapter we shall assume, unless it is otherwise 
stated, that § has characteristic two. We shall also restrict our attention to 
the case of ordinary orthogonal equivalence. 

3. Separable fields. Every separable field $\mathfrak{K}$ over $\mathfrak{F}$ is a simple extension


(126) $\mathfrak{K} = \mathfrak{F}[A]$


of all polynomials in A with coefficients in $\mathfrak{F}$, where A is a root of an irreducible
separable equation


(127) $f(x) = x^m - (a_1x^{m-1} + \cdots + a_m) = 0$.


We may easily prove the following theorem: 


THEOREM 33. Let f(x) of (127) be separable and irreducible in $\mathfrak{F}$ of characteristic
two. Then there exists an m-rowed symmetric matrix G with elements in
$\mathfrak{F}$ and f(x) as characteristic function.


For let


(128) $G_0 = \begin{pmatrix} 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \\ a_m & a_{m-1} & \cdots & a_1 \end{pmatrix}$.


Then every m-rowed square matrix with f(x) as characteristic function is
indecomposable, has f(x) as minimum function, and is similar to $G_0$. In
(109) we saw that $V^{-1}G_0V$ is a diagonal matrix and is thus symmetric. Thus
$V^{-1}G_0V = V'G_0'(V')^{-1}$,


(129) $VV'G_0' = G_0(VV')$.

The diagonal elements of VV' are

(130) $s_{2i-2} = \sum_{j=1}^{m} \alpha_j^{\,2i-2} = \Bigl(\sum_{j=1}^{m} \alpha_j^{\,i-1}\Bigr)^2 = (s_{i-1})^2$.


Hence VV’ is a definite matrix. It is this remarkable property which gives us 
our result, Theorem 33. 
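
The identity (130) rests on the fact that squaring is additive in characteristic two. As a small numerical illustration, the sketch below assumes the example polynomial $x^3 + x + 1$ over the field with two elements, whose three roots lie in the field with eight elements. Field elements are encoded as integers 0 to 7 (bit i is the coefficient of the i-th power of a root), and all names are illustrative.

```python
MODULUS = 0b1011  # x^3 + x + 1, an assumed example irreducible over GF(2)

def gf8_mul(a, b):
    """Multiply two elements of GF(8) encoded as integers 0..7."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:      # reduce as soon as degree 3 appears
            a ^= MODULUS
    return result

def gf8_pow(a, k):
    result = 1
    for _ in range(k):
        result = gf8_mul(result, a)
    return result

# The roots alpha, alpha^2, alpha^4 of x^3 + x + 1, with alpha encoded as 2.
ROOTS = [2, 4, 6]

def f(x):
    """Value of x^3 + x + 1 at a field element x."""
    return gf8_pow(x, 3) ^ x ^ 1

def power_sum(k):
    """s_k = sum of the k-th powers of the roots (addition is XOR)."""
    s = 0
    for r in ROOTS:
        s ^= gf8_pow(r, k)
    return s

def diagonal_of_VVt():
    """Diagonal elements s_0, s_2, s_4 of VV' as in (130)."""
    return [power_sum(2 * i) for i in range(len(ROOTS))]
```

In this example every power sum lies in the prime field, each diagonal element of VV' is the square of a power sum, and the first one is not zero.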

We may now write $VV' = RR'$, where R is a non-singular matrix with elements
in $\mathfrak{F}$. Then $RR'G_0' = G_0RR'$, $R^{-1}G_0R = R'G_0'R'^{-1}$. Define


(131) $G = R^{-1}G_0R$


and obtain $G' = R'G_0'R'^{-1} = G$. The matrix G is symmetric and is similar to
$G_0$. It is the desired matrix. Notice that


$W = V^{-1}R$, $W'W = I$, $WGW^{-1} = \alpha$,

so that G is orthogonally equivalent in an algebraic extension of $\mathfrak{F}$ to the
diagonal matrix $\alpha$. But then we may prove the lemma:


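LEMMA 22. Let $\mathfrak{F}_0 = \mathfrak{F}[G]$ be the field of symmetric matrices consisting of all
polynomials with coefficients in $\mathfrak{F}$ of the matrix G of (131). Then the only alternate
matrix of $\mathfrak{F}_0$ is the zero matrix. Moreover a matrix of $\mathfrak{F}_0$ is definite if and
only if it is the square of a non-zero quantity of $\mathfrak{F}_0$.

For if $\psi(G)$ is in $\mathfrak{F}_0$, we have


$W\psi(G)W' = \psi(\alpha)$.


If $\psi(G)$ is alternate so is the diagonal matrix $\psi(\alpha)$. But then $\psi(\alpha) = 0$, $\psi(G) = 0$.
Let $\psi(G)$ be definite so that we may write $\psi(G) = H'H$. Then $W^{-1} = W' = R^{-1}V$,


$\psi(\alpha) = W\psi(G)W' = (TV)'(TV)$,


where $T = HR^{-1}$ has elements $t_{ij}$ in $\mathfrak{F}$. The elements in the first column of
TV are


$t_i(\alpha_1) = \sum_{j=1}^{m} t_{ij}\,\alpha_1^{\,j-1}$ $(i = 1, \cdots, m)$.

Thus the element in the first row and column of $\psi(\alpha)$ is

$\psi(\alpha_1) = \sum_{i=1}^{m} [t_i(\alpha_1)]^2 = \Bigl[\sum_{i=1}^{m} t_i(\alpha_1)\Bigr]^2 = [r(\alpha_1)]^2$,


where $r(\alpha_1)$ is in $\mathfrak{F}[\alpha_1]$. Then $\psi(G) = [r(G)]^2$, $r(G) \neq 0$ in $\mathfrak{F}_0$.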
The only matrices commutative with G are quantities of $\mathfrak{F}_0 = \mathfrak{F}[G]$, a field
of symmetric matrices. We let


(132) $A = \mathrm{diag}\,[G, \cdots, G]$

have n = qm rows, so that A is a q-rowed matrix with elements in $\mathfrak{F}_0$. Let

(133) $\mathfrak{M}_q$ over $\mathfrak{F}_0$


be the set of all q-rowed matrices with elements in $\mathfrak{F}_0$. Then $\mathfrak{M}_q$ is the algebra
of all n-rowed matrices commutative with A. Every such matrix has the form


$B = (B_{ij})$ ($B_{ij}$ in $\mathfrak{F}_0$; $i, j = 1, \cdots, q$)

and is an n-rowed matrix with elements in $\mathfrak{F}$. We define

$B^T = (B_{0ij})$, $B_{0ij} = B_{ji}$ $(i, j = 1, \cdots, q)$


so that $B^T$ is the transpose of B considered as a q-rowed matrix of $\mathfrak{M}_q$ over $\mathfrak{F}_0$.




But B is an n-rowed square matrix with elements in $\mathfrak{F}$, and

$B' = (B_{ji}')$ $(i, j = 1, \cdots, q)$.


However every $B_{ij}$ is symmetric, and thus we have proved that $B' = B^T$ for
every B of $\mathfrak{M}_q$ over $\mathfrak{F}_0$. We now have the following result:


THEOREM 34. Let S be an n-rowed symmetric matrix commutative with A.
Then S is in $\mathfrak{M}_q$ over $\mathfrak{F}_0$ and is symmetric if and only if it is symmetric in $\mathfrak{M}_q$,
is alternate if and only if it is alternate in $\mathfrak{M}_q$, and is definite if and only if it is
definite in $\mathfrak{M}_q$.

For $S^T = S'$ and $S = S'$ if and only if $S = S^T$. If $S = (S_{ij})$ and the $S_{ij}$ are in
$\mathfrak{F}_0$, then $S = S'$ and $S_{ii} = 0$ implies that S is alternate. Conversely if S is alternate,
we have $S = S'$, and the $S_{ii}$ are alternate. By the lemma above the
$S_{ii} = 0$; S is alternate in $\mathfrak{M}_q$. If S is definite in $\mathfrak{M}_q$, the $S_{ii}$ are the squares of
quantities of $\mathfrak{F}_0$ and are thus zero or definite; S is definite. Conversely let S
be a definite n-rowed matrix. Then the principal sub-matrices $S_{ii}$ are either
zero or definite and at least one $S_{ii} \neq 0$. By our lemma each $S_{ii} = (T_{ii})^2$, $T_{ii}$ in
$\mathfrak{F}_0$ and not all zero, S is definite in $\mathfrak{M}_q$.

Consider a symmetric field of matrices $\mathfrak{K} = \mathfrak{F}[B]$, B a root of f(x) = 0 of
(127). Then B is similar to A and


(134) $B = PAP^{-1}$, $A = \mathrm{diag}\,[G, \cdots, G]$,


as in (132). Since A and B are symmetric we may apply Lemma 20 to see
that P'P is in $\mathfrak{M}_q$ over $\mathfrak{F}_0$. By Theorem 34 the matrix P'P is definite in $\mathfrak{M}_q$
and $P'P = Q'Q$ where Q is in $\mathfrak{M}_q$, QA = AQ. Lemma 21 then states that B is
orthogonally equivalent to A, and we have the following theorem:

THEOREM 35. Any two n-rowed symmetric matrices with the same separable 
irreducible minimum function over § of characteristic two are orthogonally equiv- 
alent in §. 

Theorem 34 may be applied to prove a generalization of Theorem 32. 

THEOREM 36. Let $f(x) = c(x^{\pi})$ be irreducible in $\mathfrak{F}$ of characteristic two, c(x)
be separable of degree m, and $\pi = 2^k > 1$. Then there exists an alternate n-rowed
square matrix E and an n-rowed matrix B such that


(135) $f(B) = 0$, $E^{-1}B'E = B$,


for every n divisible by $2^km$.

It is clearly sufficient to give the proof for $n = 2^km$. We construct an m-rowed
symmetric matrix G which is a root of c(x) = 0. Use Theorem 32 for
the field $\mathfrak{F}_0$, and a = G and obtain an n-rowed square matrix B such that B
considered as a matrix of $2^k$ rows over $\mathfrak{F}_0$ has $x^{\pi} - G$ as minimum function.
Then $B^{\pi} = A$, A the matrix of (132), f(B) = 0 as desired. Also there exists a
matrix E such that $E^{-1}B^JE = B$, $E^J = E$, where E is alternate, J is the involution
of the algebra of all $2^k$-rowed matrices over $\mathfrak{F}_0$. By Theorem 34 we have E
alternate, $E^{-1}B'E = B$.

4. Application to primary matrices. Let S be a symmetric primary matrix.
We apply Theorem 31 to prove the existence of a polynomial A = A(S)
in S such that


(136) $S^{\pi} = A + N$,


where N is nilpotent, c(A) = 0. Here the minimum function of S is $[c(x^{\pi})]^e$;
N has index e. By an orthogonal transformation we may transform A into
the form (132). By Theorem 34 both N and S are symmetric over $\mathfrak{F}_0$, N is
nilpotent. Then $S^{\pi} = GI_q + N$, and the minimum and characteristic functions
of S over $\mathfrak{F}_0$ are


(137) $(x^{\pi} - G)^e$, $(x^{\pi} - G)^f$,


where $x^{\pi} - G$ is irreducible in the field $\mathfrak{F}_0$ and S is now a matrix of $q = 2^kf$ rows.
Let T be symmetric and similar to S. There is no loss of generality if we 
take A(T), which is similar to A, equal to A. For by Theorem 35 the matrix 


A(T) is orthogonally equivalent to A. But then S and T are matrices with 
elements in §o. We now prove the statement: 


THEOREM 37. The matrices S and T are orthogonal in § if and only if they 
are orthogonal considered as matrices over §o. 


For SA = AS, $T = PSP^{-1}$, so that P'P is commutative with S and hence
with A. Then P'P is definite, and we have already seen that P'P is a definite
matrix over $\mathfrak{F}_0$. Now S and T are orthogonally equivalent if and only if
$C'(P'P)C = I$ for CS = SC. Thus CA = AC, C is a matrix with elements in $\mathfrak{F}_0$
and is commutative with S; S and T are orthogonally equivalent when considered
as matrices over $\mathfrak{F}_0$.

We now apply Theorem 30 to see that S is similar in § to a matrix S,, 
where 


(138) 
and 


(139) 




By Theorem 30 the matrix JN, is nilpotent and has the same indices as S. 
Finally Theorem 37 implies that T is orthogonally equivalent to S over § if 
and only if T is orthogonally equivalent to S over §o. We have thus reduced 
our considerations for primary matrices to the case of matrices with minimum 
function $(x^{\pi} - a)^e$, a in our field $\mathfrak{F}$. We shall prove that e > 1; that is, the
following theorem:

THEOREM 38. There exist no inseparable fields of symmetric matrices over § 
of characteristic two. 


For let $\mathfrak{K}$ over $\mathfrak{F}$ be inseparable. Then there exists a symmetric matrix S
in $\mathfrak{K}$ such that $S^2 = A$, A as in (132). Then

$S^2 = \mathrm{diag}\,[G, \cdots, G]$,

where $x^2 - G$ is irreducible in $\mathfrak{F}_0$. But $S^2$ is definite over $\mathfrak{F}$ since $S^2 = S'S$, S is
non-singular. By the proof of Lemma 22 the matrix $S^2$ is definite over
$\mathfrak{F}_0$, $G = [\psi(G)]^2$, $x^2 - G$ is reducible in $\mathfrak{F}_0$, and we have a contradiction.

5. Nilpotent matrices. We shall construct symmetric nilpotent matrices
with arbitrary indices. Let U be defined as in (112), $N_0$ be the matrix A of
(111) with a = 0. Then $N_0$ is nilpotent of index and order n and is indecomposable.

The matrix

(140) $Uf(N_0) = \begin{pmatrix} 0 & 0 & \cdots & 0 & a_0 \\ 0 & 0 & \cdots & a_0 & a_1 \\ \vdots & & & & \vdots \\ 0 & a_0 & \cdots & a_{n-3} & a_{n-2} \\ a_0 & a_1 & \cdots & a_{n-2} & a_{n-1} \end{pmatrix}$

for any polynomial $f(N_0)$ with coefficients $a_i$ in $\mathfrak{F}$. We may write


$f(x) = f_1(x^2) + xf_2(x^2)$.


Then $f_i(x^2)$ is the square of a polynomial in x with coefficients in $\mathfrak{F}$ if and only
if its coefficients are squares in $\mathfrak{F}$. We now have the theorem:


THEOREM 39. Let $h(x) = f_1(x^2)$ or $f_2(x^2)$ according as n is odd or even. Then
$Uf(N_0)$ is alternate if and only if h(x) = 0 and is semi-definite if and only if
$h(x) = [g(x)]^2 \neq 0$.

For we observe that the only elements on the main diagonal of $Uf(N_0)$
are zeros and the complete set of coefficients of h(x). Thus $Uf(N_0)$ alternate
implies that h(x) = 0; $Uf(N_0)$ semi-definite implies that h(x) is a square.
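
The observation about the main diagonal in the proof of Theorem 39 can be checked directly. The sketch below builds $Uf(N_0)$ in integer arithmetic, assuming $N_0$ has ones on the superdiagonal and U has ones on the anti-diagonal as in (111) and (112). The function names and the sample coefficients are illustrative only.

```python
def matmul(X, Y):
    """Product of two square integer matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_in_matrix(coeffs, M):
    """Evaluate a_0 I + a_1 M + a_2 M^2 + ... for coeffs = [a_0, a_1, ...]."""
    n = len(M)
    result = [[0] * n for _ in range(n)]
    power = [[int(i == j) for j in range(n)] for i in range(n)]
    for a in coeffs:
        result = [[result[i][j] + a * power[i][j] for j in range(n)]
                  for i in range(n)]
        power = matmul(power, M)
    return result

def hankel_form(coeffs):
    """The matrix U f(N_0) of (140), with n equal to the number of coefficients."""
    n = len(coeffs)
    N0 = [[1 if j == i + 1 else 0 for j in range(n)] for i in range(n)]
    U = [[1 if i + j == n - 1 else 0 for j in range(n)] for i in range(n)]
    return matmul(U, poly_in_matrix(coeffs, N0))

def diagonal(M):
    return [M[i][i] for i in range(len(M))]
```

For odd n the diagonal carries the even-indexed coefficients, that is the coefficients of $f_1$, and for even n it carries the odd-indexed ones, the coefficients of $f_2$. With coefficients 10, 11, 12, 13, 14 the diagonal of `hankel_form` is 0, 0, 10, 12, 14.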




The matrices $f(N_0)$, $Uf(N_0)$ are non-singular if and only if $a_0 \neq 0$. In particular
$U(I + N_0)$ is non-singular. By Theorem 39 it is definite. We now write

$U(I + N_0) = P'P$, $N = PN_0P^{-1}$.


Since $UN_0 = N_0'U$ we have $P'PN_0 = N_0'P'P$, $PN_0P^{-1} = P'^{-1}N_0'P' = N'$,
and N is symmetric.

We have thus proved the existence of an indecomposable nilpotent sym- 
metric matrix N of any order n. Every nilpotent matrix is similar to a direct 
sum of indecomposable nilpotent matrices, we form such a direct sum and 
obtain the result: 

THEOREM 40. There exist symmetric nilpotent matrices with any indices. 

6. Primary matrices. We shall require the following lemma: 

LEMMA 23. Let $a \neq 0$ be in $\mathfrak{F}$, $e_1 \geq e_2 \geq \cdots \geq e_t \geq 1$ be integers such that
$e_1 > 1$. Then there exists a symmetric nilpotent matrix N with $e_1, \cdots, e_t$ as indices
and a symmetric matrix Q commutative with N such that


+m) 


is semi-definite. 


For let us take 


N, 0 
O Ne 
where the index of NV; is e:>1. Then we choose V;+0 to have order e; and be 


indecomposable, N, to be the matrix 


PNoP', P’P = U(I + No) 


as in §5. If 


then Q is symmetric and commutative with NV. Moreover (141) is semi-defi- 
nite if and only if 


is semi-definite. The matrix (142) is congruent to 


0 
+ 


\ 
Now $P'f(N_1)P = P'Pf(N_0) = U(I + N_0)f(N_0)$. If $e_1$ is even,
we take $Q_1 = I + N_1$ and have

$P'Q_1P = U(I + N_0)(I + N_0) = U(I + N_0^2)$

alternate by Theorem 39. Also

$P'Q_1(aI + N_1)P = U(I + N_0^2)(aI + N_0) = U(aI + N_0 + aN_0^2 + N_0^3)$

is semi-definite. But then the matrix (143) congruent to (142) is semi-definite,
and so is (141). Similarly when $e_1$ is odd we take $f(N_1) = Q_1 = N_1 + N_1^2$,

$P'Q_1P = U(N_0 + N_0^3)$, $P'Q_1(aI + N_1)P = U(N_0^2 + N_0^4 + aN_0 + aN_0^3)$.

The first of these matrices is alternate, the second is semi-definite since now
$e_1 \geq 3$, and $N_0^2 \neq 0$.
We next prove the lemma: 


Lemma 24. Let M be a symmetric matrix with elements in an infinite field §, 
and let Q be symmetric and commutative with M such that 


ox) 


is semi-definite. Then the matrix 
(145) 


is similar to a symmetric matrix B, and there exists a matrix S = S' commutative 
with B such that 


146 
(0 


is semi-definite. 
Since § is infinite we apply Theorem 8 to obtain a quantity )#0 in § such 
that the matrix b‘7+(Q?M is definite. Then 


is semi-definite since (144) is semi-definite. But 


148 = = = | <0, 
(148) =| om om | + | 


so that Q) is a definite symmetric matrix, and we may write 




(149) Qo = Po Po, 

where P, is non-singular. Also 

(150) QoBo = ( 
OM 


by direct computation. Then Pj PoBo=By Po. 
The matrix B=P,B,P;" is now the desired symmetric matrix. Take 
congruent to Sp given by 


(151) 


Then Sy is semi-definite and so is S. The matrix 


(152) SoBo = So ( = Bg So 
10/7. QM 


), Bi Qo = QoBo 


is also semi-definite. Now SB=BS )SoPPoBoP = (P=) "(SoBo) Po 
is congruent to SoBo and is semi-definite. This proves that (146) is semi- 
definite and completes the proof of our lemma. 

We use Lemma 24 to prove a fundamental result: 

THEOREM 41. A primary matrix B with elements in a field § of character- 
istic two is similar in § to a symmetric matrix if and only if the minimum func- 
tion of B is not an irreducible inseparable polynomial. 

For if B has an irreducible inseparable minimum function and By is sym- 
metric and similar to B, the field § [Bo] is inseparable, contrary to Theorem 38. 
Conversely let the minimum function of B be not an irreducible inseparable 
polynomial. Then the argument of §4 together with Theorem 40 reduces the 
proof of our theorem to the question of the existence of a symmetric matrix B 
with arbitrary indices 


(153) 
and with 
(154) (a in §) 


as minimum function, $x^{\pi} - a$ irreducible in $\mathfrak{F}$. By Theorem 40 there exists a
symmetric nilpotent matrix N with indices $e_1, \cdots, e_t$. The matrix $M = aI_f + N$
has the same indices as N, where $f = e_1 + \cdots + e_t$, and so does


59 




by Theorem 30. We use Lemma 24 and choose the symmetric matrix M so
that there exists an $M_1$ similar to (155) and having the further property of
Lemma 24. By Theorem 30 the matrix $M_1$ has the same indices as N. Its characteristic
function is $(x^2 - a)^f$. An evident induction yields a matrix $M_k$ which
is symmetric and is the desired matrix B.

A field § which is perfect has the property that there are no inseparable 
polynomials over §. For example every finite field is perfect. But then 
Lemma 11 and Theorem 41 give the following theorem: 


THEOREM 42. Every square matrix with elements in a perfect field § of char- 
acteristic two is similar in § to a symmetric matrix. 
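
For the smallest perfect field of characteristic two, Theorem 42 can be confirmed by exhaustive search in low dimensions. The following sketch assumes the field with two elements and matrices of two or three rows. It collects every matrix of the form $P^{-1}SP$ with S symmetric and P non-singular and compares the result with the set of all matrices. All names are illustrative.

```python
from itertools import product

def matmul(X, Y, n):
    """Product of two n-rowed matrices over the field with two elements."""
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(n)) % 2
                       for j in range(n)) for i in range(n))

def all_matrices(n):
    for bits in product((0, 1), repeat=n * n):
        yield tuple(tuple(bits[i * n:(i + 1) * n]) for i in range(n))

def inverse(P, n):
    """Inverse of P found among its powers, or None when P is singular."""
    identity = tuple(tuple(int(i == j) for j in range(n)) for i in range(n))
    prev, cur, seen = identity, P, set()
    while cur not in seen:
        if cur == identity:
            return prev  # prev * P == identity
        seen.add(cur)
        prev, cur = cur, matmul(cur, P, n)
    return None

def every_matrix_similar_to_symmetric(n):
    """True when each n-rowed matrix over GF(2) is similar to a symmetric one."""
    mats = list(all_matrices(n))
    symmetric = [S for S in mats if S == tuple(zip(*S))]
    pairs = [(P, inverse(P, n)) for P in mats]
    pairs = [(P, Pinv) for P, Pinv in pairs if Pinv is not None]
    reachable = set()
    for P, Pinv in pairs:
        for S in symmetric:
            reachable.add(matmul(matmul(Pinv, S, n), P, n))
    return len(reachable) == len(mats)
```

The search is only a finite check for two field sizes of matrix and says nothing about imperfect fields, where Theorem 41 shows the statement can fail.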


7. Orthogonal equivalence of primary matrices. We have reduced the
problem of the orthogonal equivalence of primary symmetric matrices to the
case of matrices with characteristic function $(x^{\pi} - a)^f$, $x^{\pi} - a$ irreducible in $\mathfrak{F}$.
The case a = 0 is the case of nilpotent matrices. We may now easily show that
two matrices may be similar and not orthogonally equivalent. Since the first
part of our problem is that of the orthogonal equivalence of symmetric nilpotent
matrices, and since, even in this case, the only criteria are of a necessarily
complicated nature, we shall restrict our attention to the nilpotent
case. It is the only case occurring when $\mathfrak{F}$ is perfect.

Every symmetric nilpotent matrix A with indices $e_1, \cdots, e_t$ is similar in
$\mathfrak{F}$ to


(156) $N = \mathrm{diag}\,[N_1, \cdots, N_t]$,


where $N_i$ is nilpotent of index and order $e_i$ and may be taken to be symmetric
by Theorem 41. Then N is symmetric and


(157) $A = LNL^{-1}$, $L'L = Q$,

where Q is commutative with N by Lemma 20. This last condition states that


(158) $Q = (Q_{ij})$, $N_iQ_{ij} = Q_{ij}N_j$ $(i, j = 1, \cdots, t)$,


so that in particular the $Q_{ii}$ are polynomials in $N_i$. Also by Lemma 21 if

(159) $A_0 = L_0NL_0^{-1}$, $Q_0 = L_0'L_0$,


and $A_0$ is similar to A, then $A_0$ is orthogonally equivalent to A if and only if
$Q_0 = C'QC$ is congruent to Q under a transformation whose matrix C is commutative
with N. These are the formal orthogonal equivalence conditions.

We first study the case where $N = N_1$ is indecomposable. By the proof of
Theorem 40 we may choose N so that


(160) $P'P = U(I + N_0)$, $N = PN_0P^{-1}$,

with (140) holding. Now

(161) $Q = Q(N)$, $Q_0 = Q_0(N)$, $C = C(N)$

are all polynomials in N and our condition is simply

(162) $Q_0 = C^2Q$.

The matrix Q is definite if and only if $P'QP = P'PQ(N_0) = U(I + N_0)Q(N_0)$
is definite. Write

(163) $f(N_0) = (I + N_0)Q(N_0)$.


Since $I + N_0$ is non-singular there exists a $Q(N_0)$ for any $f(N_0)$. But Theorem
39 gives necessary and sufficient conditions that $Uf(N_0)$ be definite. Moreover
if $f_0(N_0) = (I + N_0)Q_0$, then $Uf_0(N_0)$ may be definite for many polynomials
$Q_0 \neq QC^2$. In fact if $f = f_1 + N_0f_2$, then $f_1C^2 + N_0f_2C^2 = f_{10} + N_0f_{20}$ if and only
if $f_{10} = f_1C^2$, $f_{20} = f_2C^2$. For example, if e = 3, we may explicitly compute $Uf(N_0)$
which is definite if and only if $a_0 = b_0^2 \neq 0$, $a_2 = b_1^2$,


$f(N_0) = (b_0 + b_1N_0)^2 + a_1N_0$.

We write $c(N_0) = c_0 + c_1N_0 + c_2N_0^2$, $[c(N_0)]^2 = c_0^2 + c_1^2N_0^2$ so that

$[c(N_0)]^2f(N_0) = [b_0c_0 + (c_0b_1 + c_1b_0)N_0]^2 + c_0^2a_1N_0$.


Now $f_0(N_0) = (d_0 + d_1N_0)^2 + d_2N_0 = [c(N_0)]^2f(N_0)$ for some $c(N_0)$ if and only if
$d_0 = b_0c_0$, $d_1 = c_0b_1 + c_1b_0$, $d_2 = c_0^2a_1$. Then $c_0$ and $c_1b_0 = d_1 - c_0b_1$ are
determined. But $d_2$ is at our choice and can be given distinct from $c_0^2a_1$.
Notice that in general our first condition is $f_{10} = c^2f_1$ which may always have a
solution c, but that this solution may not satisfy $f_{20} = f_2c^2$.

THEOREM 43. There exist similar indecomposable symmetric nilpotent mat- 
rices which are not orthogonally equivalent. 

Decomposable nilpotent matrices need not be orthogonally decomposable. 
A very simple example may be obtained as follows. Let 


The matrix A is a nilpotent matrix of index two in $\mathfrak{F}$ of characteristic two and
is similar in $\mathfrak{F}$ to

$\begin{pmatrix} N & 0 \\ 0 & N \end{pmatrix}$, $N = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$.


If A were orthogonally equivalent to a direct sum, the components would be
necessarily nilpotent of index two, hence A would be orthogonal to


(164) $\begin{pmatrix} aN & 0 \\ 0 & bN \end{pmatrix}$,


a, b not zero and in $\mathfrak{F}$. For the only two-rowed nilpotent symmetric matrices
are multiples of the N above, and A has rank two, $ab \neq 0$. But (164) is non-alternate,
while A is alternate and cannot even be congruent to (164).
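
The last step uses the fact that a congruence $C'AC$ of an alternate matrix is again alternate. A brute-force illustration over the field with two elements is sketched below, under the assumption that "alternate" means symmetric with zero main diagonal. The function names are illustrative, and the 2-rowed matrix `N2` is the matrix N displayed above.

```python
from itertools import product

def matmul(X, Y, n):
    """Product of two n-rowed matrices over the field with two elements."""
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(n)) % 2
                       for j in range(n)) for i in range(n))

def transpose(X):
    return tuple(zip(*X))

def all_matrices(n):
    for bits in product((0, 1), repeat=n * n):
        yield tuple(tuple(bits[i * n:(i + 1) * n]) for i in range(n))

def is_alternate(S):
    """Symmetric with zero main diagonal (the characteristic-two meaning)."""
    return S == transpose(S) and all(S[i][i] == 0 for i in range(len(S)))

def congruence_preserves_alternate(n):
    """Check that C'AC is alternate for every alternate A and every C."""
    mats = list(all_matrices(n))
    alternate = [A for A in mats if is_alternate(A)]
    for C in mats:
        Ct = transpose(C)
        for A in alternate:
            if not is_alternate(matmul(matmul(Ct, A, n), C, n)):
                return False
    return True

# The symmetric nilpotent matrix N of the example: its square is zero mod 2,
# and its diagonal is not zero, so it is not alternate.
N2 = ((1, 1), (1, 1))
```

Since every non-zero multiple of N has a non-zero diagonal, the direct sum (164) is not alternate, which is the obstruction used in the text.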

8. Reduction to primary components. Consider first the question of reducing
a J-symmetric matrix to primary components. We let $\mathfrak{F}$ be an arbitrary
field and let J be an involution defined by


(165) $A^J = E^{-1}A'E$,

where $E' = \epsilon E$ is non-singular, $\epsilon = \pm 1$. Suppose that

(166) $A^J = \delta A$
and that 


G 0 
(167) A P( 
0H 


where G and H are relatively prime according to our definition. Then 
EA! =A'E=6EA, 


0 H’ 


G 0 ‘0 
(168) | ( ) 
0H 


so that 


(169) ) 


By Lemma 12 we have 


(170) P'EP E{ Ei = 
~\o 1 = €£1, 2 = €£2. 
Also E,\G =6G’E,, E:H =6H'E2, so that we have the first part of the following 
theorem: 

THEOREM 44. Let E’ AJ = E-'A’E=56A, where «= +1, +1 and the 
square matrix A has relatively prime components G and H. Then 


E, 0 


G 0 
(171) A - pep ), Ei Ez = 


2 


and the matrices The components G 
and H are not unique, but if 


-( 
= ( ) PEP. 
0 6H’ 




(172) 


then Go=Qr'GQ;, Ho=Qzr'HQze, and the replacement of G and H by similar 
matrices replaces E, and E2 by congruent matrices such that 


QiEQ, 0 ) 
0 


The last part of the above is due to Lemma 12 and 


0 O G 0 Q, 0 
(174) )=( ) Py = P( ). 
0 Ho 0H 0 Qe 


We now let A» be a matrix similar to A and such that A,’ =«€A>. Then 


@ L; 0 
(175) Ao = R( R'ER = ( ). 
0 Ao 0 Ls 


If Ap>=DAD-", where then 


0H "NO Ay 


so that 


(173) P{ EP, = ( 


Ci 0 
(177) DP = ), Hy 
2 


Moreover E-!D’/ED=I, 


E, 0 CiLic, 
(178) (DP)'E(DP) = P’EP = ( ) = ( ). 
0 0 Ci LL: 


Hence necessarily Z; and E, are congruent; LZ, and E, are congruent. Hence 
we may replace Gp and H, by similar matrices and take E,=L;, E,=L2. We 
may now prove the following theorem: 

THEOREM 45. Let Ao’ =65Ao be similar to A of Theorem 44. Then Ag is 
J-orthogonally equivalent to A only if 


Go 0 E, 0 
(179) Ao = R( RER= ( 
0 Ho 0 Ee 


so that =5Go, Moreover Ay is J-orthogonally equivalent to A if 
and only if Go is J;-orthogonally equivalent to G, and Ho is J2-orthogonally 
equivalent to H. 




This theorem reduces the problem of J-orthogonal equivalence to the 
similar problem for primary matrices. For proof we need only take Z,= &,, 
L, = E, above and obtain C/ £,C, = and similarly C,’C, = Thus 
Go=CiGCy, Ho as desired. 

We apply Theorem 45 to our case of orthogonal equivalence in § of char- 
acteristic two. Suppose that A =A’ has components G and H. Then 


E, 0 
(180) P'IP = ( ). 
0 Es 


One of E; and E, must be definite by Theorem 7. Assume that £, is definite 
so that we may take E,=/, to be an identity matrix. Then two cases arise. 
In the first case E, is definite; we may take E,=J; and have P’P=1/; P is an 
orthogonal matrix and A is orthogonally equivalent to 


(0 


where both G and H are symmetric. Moreover 


0 
Ay = Pe'Po = I, 


is orthogonally equivalent to A if and only if G and G» are orthogonally 
equivalent, and H and H, are orthogonally equivalent. 

We next consider the only remaining case, that where £; is an alternate 
matrix. Then H has even order 2s, and we may take 


Ey, = 


The matrix H is J2-symmetric by Theorem 42, and 
= Ex'H’'Es. 


However G is symmetric. Finally if A» is similar to A, then necessarily 


0 0 Hy 0 > 0 0 0 


if Ay is to be orthogonally equivalent to A. We apply this result to prove the 
following theorem: 

THEOREM 46. There exist symmetric matrices A which have symmetric rela- 
tively prime components G, H and are such that A is not orthogonally equivalent 
to any direct sum 


0 
I, 0 




for Go similar to G, Ho similar to H. 


For let us construct any symmetric matrix G* whose characteristic func- 
tion is prime to that of H, where 


Then Ez'H’E,=H by direct computation. Also 


0 


is definite. Hence 


is symmetric and has G and H as components. But if A were orthogonally 
equivalent to (181), we could apply Theorem 44 to see that E2 is congruent to 
the identity matrix J2,. But this is impossible since E, is alternate. 

We now use the form of (180) to prove the theorem: 


THEOREM 47. A matrix A with elements in a field § of characteristic two is 
similar to a symmetric matrix if and only if the minimum function of A is not 
a product of distinct inseparable irreducible polynomials. 


For every symmetric A is similar to a direct sum of its relatively prime 
primary components. By (180) and Theorem 45 one of these primary com- 
ponents B; is similar to a symmetric matrix. But then Theorem 41 implies 
that the minimum function of B;, which is a factor of that of A, is not an in- 
separable irreducible polynomial. Conversely let A have minimum function 


f(x) = g(x)- A(x), 


where g(x) and h(x) are relatively prime, /(x) is the product of all irreducible 
inseparable factors of f(x) whose second power does not divide f(x). By Theo- 
rem 41 corresponding to each distinct power [g;(x) ]* of an irreducible poly- 
nomial occurring in the factorization of g(x) into such factors there corre- 
sponds a symmetric matrix G; which is a primary component of A. Their 
direct sum is a symmetric component G of A. By Theorem 36 correspond- 


* In particular we might take G to define a symmetric separable field, S nilpotent. 


G 0 
A= P( 

0H 




ing to each irreducible factor h;(«) of h(x) there is an indecomposable matrix 
H; with h,(x) as characteristic function and such that £; is alternate, 
Ez} E;=H,;. The direct sum of the H; is a matrix H which is a component 
of A such that A is similar to 


G 0 
( ), G=G', E"H’E=E, 
0H 


where E is the alternate matrix which is the direct sum of the £;. But 


I 0 G 0 
0H 


is symmetric by Theorem 44 and is similar to A. 


UNIVERSITY OF CHICAGO,
CHICAGO, ILL.


GENERALIZED INTEGRALS AND DIFFERENTIAL 
EQUATIONS* 


BY 
HANS LEWY 


Introduction. The idea of the following considerations can best be explained
in the simplest case of an integral


$\int a(x, f(x))\, db(x, f(x))$


in which a and b are continuously differentiable functions of two variables,
f(x) a continuously differentiable function. The routine estimate of $\int a\,db$ gives
bounds depending either on $\int |df(x)|$ or on the maximum of |df/dx| in the
interval of integration. It is, however, possible, as shown in Theorem 1, to
give a bound that is entirely independent of the derivative of f(x), and, consequently,
to define, by a limiting process, $\int a\,db$, even in the case where f(x)
not only has no derivative, but is no longer continuous, provided f(x) belongs
to Baire's first class. The same observation holds for a great number of functionals
of f(x) whose construction depends on the derivative of f(x), but
for which bounds can be found nevertheless without reference to df/dx. In
this paper we are concerned mainly with ordinary differential equations
(Theorems 2-3') and systems of hyperbolic equations in two independent
variables (Theorem 6 and corollaries) whose treatment is based on a detailed
study of the double integral (33).

1. Simple integrals. The theory of the Lebesgue-Stieltjes integral contains
the following statement: If in a closed interval J the function $\beta_0(x)$
is monotone and the sequence of continuous functions $\alpha_\mu(x)$ is uniformly
bounded and tends to a limit function $\alpha_0(x)$, then the Stieltjes integral
$\int \alpha_\mu(x)\,d\beta_0(x)$ tends to the Lebesgue-Stieltjes integral $\int \alpha_0(x)\,d\beta_0(x)$ as $\mu \to \infty$.
If, furthermore, a sequence of functions $\beta_\mu(x)$ of bounded variation tends to
$\beta_0(x)$ as $\mu \to \infty$ so that the total variation of the difference $\beta_\mu(x) - \beta_0(x)$ tends
to zero, then


$\limsup_{\mu \to \infty} \Bigl| \int \alpha_\mu(x)\,d\beta_\mu(x) - \int \alpha_\mu(x)\,d\beta_0(x) \Bigr| \leq \limsup_{\mu \to \infty} \Bigl\{ \max |\alpha_\mu(x)| \cdot \int |d(\beta_\mu - \beta_0)| \Bigr\} = 0$,


* Presented to the Society, November 28, 1936; received by the editors May 20, 1936, and May 
24, 1937. 




438 HANS LEWY [May 


which relation leads to the following lemma: 


LEMMA 1. If in the interval $X_0 \le x \le X$ the continuous functions $\alpha_\mu(x)$, $\mu = 1, 2, \dots$, are uniformly bounded and tend to $\alpha_0(x)$ as $\mu \to \infty$, and if the functions $\beta_\mu(x)$ of bounded variation tend to a function $\beta_0(x)$ of bounded variation while the total variation of $\beta_\mu(x) - \beta_0(x)$ tends to zero as $\mu \to \infty$, then

$$\lim_{\mu \to \infty} \int_{X_0}^{x} \alpha_\mu(x)\,d\beta_\mu(x) \ \text{ exists and } = \int_{X_0}^{x} \alpha_0(x)\,d\beta_0(x),$$

where the integral on the right is the Lebesgue-Stieltjes integral.
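The convergence asserted in Lemma 1 can be checked numerically on a small example. The sketch below is only an illustration: the sequences `alpha_mu` and `beta_mu` are hypothetical choices (not taken from the paper) that satisfy the hypotheses on $[0, 1]$, and the Stieltjes integral is approximated by a plain left-endpoint sum.

```python
# Illustration only: alpha_mu and beta_mu are hypothetical sequences chosen
# to satisfy the hypotheses of Lemma 1 on [0, 1]; they are not from the paper.

def stieltjes_sum(alpha, beta, x0, x1, steps=20000):
    """Left-endpoint Riemann-Stieltjes sum of alpha d(beta) over [x0, x1]."""
    h = (x1 - x0) / steps
    total = 0.0
    for k in range(steps):
        left = x0 + k * h
        right = left + h
        total += alpha(left) * (beta(right) - beta(left))
    return total

def lemma1_demo(mu):
    # alpha_mu is uniformly bounded and tends to alpha_0(x) = x.
    alpha_mu = lambda x: x + 1.0 / mu
    # beta_mu tends to beta_0(x) = x^2; the total variation of
    # beta_mu - beta_0 = x / mu on [0, 1] equals 1 / mu and tends to zero.
    beta_mu = lambda x: x * x + x / mu
    return stieltjes_sum(alpha_mu, beta_mu, 0.0, 1.0)

# Limit predicted by the lemma: integral of x d(x^2) over [0, 1] = 2/3.
limit_value = 2.0 / 3.0
errors = [abs(lemma1_demo(mu) - limit_value) for mu in (1, 10, 100, 1000)]
```

The list `errors` shrinks as `mu` grows, up to the discretization error of the left-endpoint sum.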


We proceed to introduce a new notion of integral which is essentially 
different from the Lebesgue-Stieltjes integral and is based upon the follow- 
ing theorem: 

THEOREM 1. In the interval $J$ ($X_0 \le x \le X_1$) there are given a function $f(x)$ satisfying the inequality $|f(x) - f(X_0)| \le F$, a sequence $\{f_\mu(x)\}$ of continuously differentiable functions with $|f_\mu(x) - f(X_0)| \le F$ such that $f_\mu(x) \to f(x)$ as $\mu \to \infty$, and $n$ continuous functions $g_1(x), \dots, g_n(x)$ with bounded total variations $T[g_1], \dots, T[g_n]$. Denote by $f^0, g_1^0, \dots, g_n^0$ the values of $f(X_0), g_1(X_0), \dots, g_n(X_0)$ and by $G_1, \dots, G_n$ upper bounds of $|g_1(x) - g_1^0|, \dots, |g_n(x) - g_n^0|$ in $J$. Let $\{g_{1\mu}(x)\}, \dots, \{g_{n\mu}(x)\}$ be $n$ sequences of continuous functions of bounded variation, tending to $g_1(x), \dots, g_n(x)$, respectively, as $\mu \to \infty$ while the total variations of the differences $T[g_{i\mu} - g_i]$ tend to zero and $|g_{i\mu}(x) - g_i^0| \le G_i$ for all $x$ in $J$, $i = 1, 2, \dots, n$, and $\mu = 1, 2, \dots$. Suppose


that in the (n+2)-dimensional domain 
D 


the functions $a(z, x, y_1, \dots, y_n)$ and $b(z, x, y_1, \dots, y_n)$ are continuously differentiable. Then the Stieltjes integral


tends to a limit L(x) as p>. 

Remarks. It is clear that any function $f(x)$ with $|f(x) - f^0| \le F$ which is a limit of continuous functions may be considered as a limit of continuously differentiable functions $f_\mu(x)$ with $|f_\mu(x) - f^0| \le F$. Moreover, the limit $L(x)$ is independent of the approximating sequences $f_\mu(x)$ and $g_{i\mu}(x)$. For the statement of Theorem 1 implies the existence of a limit no matter which sequences are used and hence any two sequences may be considered as subsequences of a third one containing both. Consequently we may state the following as a definition:


DEFINITION. 


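In order to prove Theorem 1, we first assume that $b(z, x, y_1, \dots, y_n)$ has continuous derivatives of second order of the mixed type. We determine a function $A(z, x, y_1, \dots, y_n)$ in $D$ as the solution of the differential equation

$$\frac{\partial A}{\partial z} = a\,\frac{\partial b}{\partial z}$$

with the initial condition $A(f^0, x, y_1, \dots, y_n) = 0$. We obtain

$$A(z, x, y_1, \dots, y_n) = \int_{f^0}^{z} a\,b_z\,dz',$$

where we may differentiate with respect to $x, y_1, \dots, y_n$ under the integral sign. Thus we find

$$a\,db = a b_z\,dz + a b_x\,dx + \sum_{i=1}^{n} a b_{y_i}\,dy_i = dA + (a b_x - A_x)\,dx + \sum_{i=1}^{n} (a b_{y_i} - A_{y_i})\,dy_i,$$

$$A_x(z, x, y_1, \dots, y_n) = \int_{f^0}^{z} \bigl[(a b_x)_z + (a_x b_z - a_z b_x)\bigr]\,dz'$$

and

$$A_{y_i}(z, x, y_1, \dots, y_n) = a b_{y_i}(z, x, y_1, \dots, y_n) - a b_{y_i}(f^0, x, y_1, \dots, y_n) + \int_{f^0}^{z} (a_{y_i} b_z - a_z b_{y_i})\,dz'.$$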


n 
| 
| 




i=1 


z Su(2’) 
+ av’ [a.(2’, 2’, g(x’), )b.(2’, 2’, ) 


z 
0 

This formula, derived under the assumption that $b$ has continuous second derivatives of mixed type, still holds under the conditions of Theorem 1. For any $b$ which is continuously differentiable in $D$ may be uniformly approximated by a polynomial such that its first derivatives uniformly approximate those of $b$. Introducing the approximations instead of $b$ into (2) and passing to the limit we obtain again the formula (2) as both sides of (2) involve only first derivatives of $b$ and the passage to the limit under the integral signs is legitimate in view of the uniform convergence of the derivatives of the polynomials to those of $b$.

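In the right-hand member of (2) we can effect the passage $\mu \to \infty$ by simply cancelling all reference to $\mu$. This may be seen as follows. An integral $\int_{f^0}^{z} a_z(z', x, y_1, \dots, y_n)\,b_{y_i}\,dz'$, for instance, is continuous in $z, x, y_1, \dots, y_n$. Thus $\int_{f^0}^{f_\mu(x)} a_z(z', x, g_{1\mu}(x), \dots)\,b_{y_i}\,dz'$ is a continuous function of $x$, bounded as $\mu \to \infty$, and converging as $\mu \to \infty$. Now the convergence of

$$\int_{X_0}^{x} dg_{i\mu}(x') \int_{f^0}^{f_\mu(x')} a_z(z', x', g_{1\mu}(x'), \dots)\,b_{y_i}\,dz'$$

follows from Lemma 1.
Thus Theorem 1 is proved.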
From (2) we have the following estimate: 


i=1 


Hence 


where $M$ is an upper bound for $|a|$ and $|b|$, $N$ an upper bound for the moduli of the first derivatives of $a$ and $b$ in $D$.
Remark. We have, for instance, for every admissible $f(x)$

$$L(x) = \int_{0}^{x} f\,df = \tfrac{1}{2}\bigl(f^2(x) - f^2(0)\bigr),$$

which leads to $L(1) = \tfrac{1}{2}$ for $f(x) = 0$ if $0 \le x < 1$, $f(1) = 1$. The Lebesgue-Stieltjes integral, however, would be 1.
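The value $\tfrac{1}{2}$ in the remark above can be observed numerically by approximating the discontinuous $f$ with smooth functions. The sketch below assumes the hypothetical approximants $f_\mu(x) = x^\mu$ (they tend to 0 on $[0, 1)$ and to 1 at $x = 1$); the approximants and the step count are illustrative choices, not part of the paper.

```python
# Illustration only: f_mu(x) = x**mu is a hypothetical smooth sequence that
# converges to the step function of the remark (0 on [0, 1), 1 at x = 1).

def f_mu(x, mu):
    return x ** mu

def integral_f_df(mu, steps=200000):
    """Left-endpoint Stieltjes sum of f_mu d(f_mu) over [0, 1]."""
    total = 0.0
    prev = f_mu(0.0, mu)
    for k in range(1, steps + 1):
        cur = f_mu(k / steps, mu)
        total += prev * (cur - prev)
        prev = cur
    return total

# Each value is close to 1/2 regardless of how steep f_mu becomes.
values = {mu: integral_f_df(mu) for mu in (1, 5, 25, 100)}
```

The sums stay near $\tfrac{1}{2}$ even though the derivative of $f_\mu$ near $x = 1$ grows without bound, which is the behaviour the generalized integral is designed to capture.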

2. Ordinary differential equations. We may now prove the following 
theorem: 


THEOREM 2. Let the functions $f(x), g_1(x), \dots, g_n(x)$ be continuously differentiable in $0 \le x \le X$ and $f(0) = f^0$, $g_1(0) = \dots = g_n(0) = 0$. Denote by $F$ an upper bound of $|f(x)|$ and by $G_1, G_2, \dots, G_n$ the total variations of $g_1(x), g_2(x), \dots, g_n(x)$ in $[0, X]$. Suppose, for $\epsilon > 0$, that in the domain

$$D_{2,\epsilon}:\quad |z| \le F,\quad |y_i| \le G_i,\quad |u| \le \epsilon + 3FM_0 + \sum_{i=1}^{n} (e^{FN} - 1 + M_i e^{FN})\,G_i$$

the functions $a_0, a_1, \dots, a_n$ are continuous functions of $z, y_i, u$ satisfying the inequalities $|a_k| \le M_k$, $k = 0, 1, \dots, n$, and that $a_0$ has continuous derivatives of first order, bounded in absolute value by $N$. Then the solution $u(x)$ of the equation

$$du(x) = a_0(f(x), g_1(x), \dots, g_n(x), u(x))\,df(x) + \sum_{i=1}^{n} a_i(f(x), g_1(x), \dots, g_n(x), u(x))\,dg_i(x) \tag{E}$$

with $u(0) = 0$ can be extended over the whole interval $0 \le x \le X$. It satisfies, moreover, the inequality

$$|u(x)| \le 2FM_0 + \sum_{i=1}^{n} (e^{FN} - 1 + M_i e^{FN})\,G_i.$$
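The point of Theorem 2 is that the bound on $u$ involves $F$ but not the derivative of $f$. A minimal numerical sketch of this, under simplifying assumptions that are not in the paper: no $g_i$ terms are present, the coefficient is the hypothetical choice $a_0(z, u) = 1 + u$, and $f_\mu(x) = F \sin(\mu x)$. In this special case the solution is $u = e^{f(x) - f(0)} - 1$, so it depends on $f$ only through its values.

```python
import math

F = 0.3  # hypothetical bound on |f|

def solve(mu, steps=20000, X=1.0):
    """Integrate du/dx = (1 + u) * f'(x), u(0) = 0, by classical RK4,
    where f(x) = F * sin(mu * x) is an illustrative choice."""
    def rhs(x, u):
        return (1.0 + u) * F * mu * math.cos(mu * x)

    h = X / steps
    x, u = 0.0, 0.0
    max_abs = 0.0
    for _ in range(steps):
        k1 = rhs(x, u)
        k2 = rhs(x + h / 2, u + h * k1 / 2)
        k3 = rhs(x + h / 2, u + h * k2 / 2)
        k4 = rhs(x + h, u + h * k3)
        u += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
        max_abs = max(max_abs, abs(u))
    return u, max_abs

# mu controls the size of df/dx (up to F * mu) while |f| stays below F.
results = {mu: solve(mu) for mu in (1, 10, 100)}
```

For every `mu` the computed maximum of $|u|$ stays below $e^{F} - 1$, although the derivative of $f_\mu$ grows like $F\mu$.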


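Let us solve the auxiliary partial differential equation for $A(z, y_1, \dots, y_n, u)$

$$\frac{\partial A}{\partial z} + a_0\,\frac{\partial A}{\partial u} = a_0 \tag{3}$$

under the initial condition

$$A(z, y_i, u) = 0 \quad \text{for } z = 0. \tag{4}$$

The characteristic equations of (3) are

$$dz : du : dA = 1 : a_0 : a_0. \tag{5}$$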


Consider the family $C$ of curves satisfying the differential equation $du/dz = a_0$ and passing through any point $P$ of the domain

$$D_2:\quad |z| \le F,\quad |y_i| \le G_i,\quad |u| \le \epsilon + 2FM_0 + \sum_{i=1}^{n} (e^{FN} - 1 + M_i e^{FN})\,G_i.$$

Since $a_0$ has continuous derivatives of first order throughout $D_{2,\epsilon}$, and is bounded by $M_0$, there exists one and only one curve through an arbitrary point $P$, and the corresponding value $\bar u$ of $u$ for $z = 0$ lies within the range

$$|\bar u| \le \epsilon + 3FM_0 + \sum_{i=1}^{n} (e^{FN} - 1 + M_i e^{FN})\,G_i.$$

Conversely, an arbitrary curve of $C$ is uniquely determined by the quantities $\bar u, y_1, \dots, y_n$, and a point of the curve is determined by giving in addition the corresponding value of $z$. On writing $u = u(z, y_i, \bar u)$, we find bounds for the derivatives of $u$ with respect to the arguments $\bar u, y_1, \dots, y_n, z$. We have, in fact,


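$$\frac{d}{dz}\,\frac{\partial u}{\partial \bar u} = \frac{\partial a_0}{\partial u}\,\frac{\partial u}{\partial \bar u}$$

with

$$e^{-N|z|} \le \frac{\partial u}{\partial \bar u} \le e^{N|z|}.$$

Similarly

$$\frac{d}{dz}\,\frac{\partial u}{\partial y_i} = \frac{\partial a_0}{\partial y_i} + \frac{\partial a_0}{\partial u}\,\frac{\partial u}{\partial y_i}, \qquad \left|\frac{\partial u}{\partial y_i}\right| \le e^{N|z|} - 1.$$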

On introducing the new variables z, v1, - ++, Yn, # instead of 2, y1,---, 
yn, we find throughout 


A(z, Vis u) Ou A(z, 79) ou (~) en < on < 
On Ou 


(6) 


—| < — 4), 
Ov: OY: Ou OY; 


Ou Ou ou Ou | 


Since @ is constant along any curve of C, we have 


Ou 
On 
hence 
hence 


Now put, throughout $D_{2,\epsilon}$,

$$A(z, y_i, u) = u - \bar u. \tag{7}$$

Evidently $A$ satisfies (3) and (4). Furthermore we have


0A 
ao, F 


| < 1), 


(8) 
0A oA 


Ou 


Returning to the ordinary differential equation (E), we remark that the conditions of our theorem allow us to write (E) in the form

$$\frac{du}{dx} = \phi(x, u)$$

with $\phi(x, u)$ continuous in the rectangle $R$ determined by

$$0 \le x \le X,\quad |u| \le \epsilon + 2FM_0 + H,\quad H = \sum_{i=1}^{n} (e^{FN} - 1 + M_i e^{FN})\,G_i.$$

The fundamental existence theorem for differential equations shows that a solution through $[x = 0, u(0) = 0]$ always may be continued until it reaches the boundary of $R$. Hence a solution which cannot be continued across a certain point $x_1$ with $0 < x_1 < X$ may be assumed to exist for $0 \le x \le x_1$ and to satisfy the condition

$$|u(x_1)| = \epsilon + 2FM_0 + H.$$

Thus our theorem is proved as soon as we show the following property of $u(x)$. If, in the interval $0 \le x \le x_1$, the solution $u(x)$ of (E) satisfies the inequality

$$|u(x)| \le \epsilon + 2FM_0 + H,$$

it satisfies the stronger inequality

$$|u(x)| \le 2FM_0 + H.$$

Indeed, by (E) we have, since $z = f(x)$, $y_i = g_i(x)$, $u = u(x)$ stay in $D_{2,\epsilon}$,



Ou Ou Ou 

—+a— =0, — | = 

Oz ou 


$$d(u - A) = (a_0 - A_z - A_u a_0)\,df(x) + \sum_{i=1}^{n} \bigl(a_i(1 - A_u) - A_{y_i}\bigr)\,dg_i(x), \tag{9}$$

with $z = f(x)$, $y_i = g_i(x)$ and $u(0) = 0$. Hence we conclude from (3) and (8)

$$|u(x) - A(f(x), g_i(x), u(x))| \le FM_0 + H, \tag{10}$$

and, by (8),

$$|u(x)| \le 2FM_0 + H. \tag{11}$$


Remarks. The function $u(z, y_i, \bar u)$ is continuous. This may be expressed by the statement: If the quantities $z$, $y_i$, $u - A(z, y_i, u)$ tend to limiting values which belong to the domain $|z| \le F$, $|y_i| \le G_i$, $|u - A(z, y_i, u)| \le H + FM_0$, then $u$ itself tends to a limiting value which in absolute value does not exceed $2FM_0 + H$.

Any function satisfying a Lipschitz condition of exponent 1 in $z, y_i, u$ satisfies also a Lipschitz condition of exponent 1 in the variables $z, y_i, u - A$. This follows from (8), for we have


(12) | u, — A(z, Yi, — te + A(z, yi, us) | = | — u;| 


THEOREM 3. Assume that in the interval $0 \le x \le X$

(i) the functions $f_\mu(x)$, $\mu = 1, 2, \dots$, are continuously differentiable, $|f_\mu(x)| \le F$, and $f_\mu(x)$ converges to a function $f(x)$ as $\mu \to \infty$;

(ii) the functions $g_{1\mu}(x), \dots, g_{n\mu}(x)$, $\mu = 1, 2, \dots$, are continuously differentiable, $g_{i\mu}(x) \to g_i(x)$ as $\mu \to \infty$, $i = 1, 2, \dots, n$, where $g_i(x)$ is continuous, and the total variations of the differences $T[g_{i\mu} - g_i]$ tend to zero as $\mu \to \infty$; furthermore $g_{i\mu}(0) = 0$ and $T[g_{i\mu}] \le G_i$ for all $i$ and $\mu$;

(iii) the functions $a_0, a_1, \dots, a_n$ are defined in

$$D_{3,\epsilon}:\quad |z| \le F,\quad |y_i| \le G_i,\quad |u| \le \epsilon + 3FM_0 + H,\quad H = \sum_{i=1}^{n} (e^{FN} - 1 + M_i e^{FN})\,G_i \qquad (\epsilon > 0),$$

and we have in $D_{3,\epsilon}$

$$|a_0| \le M_0,\ |a_1| \le M_1,\ \dots,\ |a_n| \le M_n. \tag{13}$$

Furthermore, $a_0, a_1, \dots, a_n$ have continuous derivatives of first order in $D_{3,\epsilon}$, and those of $a_0$ are bounded in absolute value by $N$ and satisfy a Lipschitz condition of exponent 1.

Then the solution of

$$du_\mu(x) = a_0(f_\mu(x), g_{i\mu}(x), u_\mu(x))\,df_\mu(x) + \sum_{i=1}^{n} a_i(f_\mu(x), g_{i\mu}(x), u_\mu(x))\,dg_{i\mu}(x), \qquad u_\mu(0) = 0, \tag{E$_\mu$}$$

exists for $0 \le x \le X$, satisfies

$$|u_\mu(x)| \le 2FM_0 + H, \tag{14}$$

and tends to a limit function $u(x)$ as $\mu \to \infty$. $u(x)$ is said to be the solution of (E) for the initial condition $u(0) = 0$.

From Theorem 2 we conclude the existence of $u_\mu(x)$ and the inequality (14). The classical statement about uniqueness of the solution of the initial problem may, incidentally, be used to establish the uniqueness of $u_\mu(x)$. On putting

$$B_\mu(x) = u_\mu(x) - A(f_\mu(x), g_{i\mu}(x), u_\mu(x)),$$

we conclude from (9) and (8)


n 


(15) | Bylo) Balas) | + — |. 

im1 
Now a theorem by Adams and Clarkson† shows that the total variation between any two points $x_1$ and $x_2$ of $g_{i\mu}(x)$ tends uniformly to that of $g_i(x)$ on account of the continuity of $g_i(x)$, and of the assumption (ii) that $T[g_{i\mu} - g_i] \to 0$, $g_{i\mu} \to g_i$. Thus (15) establishes equicontinuity for $B_\mu(x)$, while (14) gives boundedness. Hence, by Ascoli's theorem, we may select a subsequence $B_{\mu'}(x)$ tending uniformly to a function $B^*(x)$. From the remark on page 444 we conclude that also the corresponding subsequence of $u_\mu(x)$, say $u_{\mu'}(x)$, converges to a function $u^*(x)$. $B^*(x)$ satisfies the following integral equation

$$B^*(x) = \int_{0}^{x} \sum_{i=1}^{n} \bigl(a_i(1 - A_u) - A_{y_i}\bigr)\,dg_i(x) - A(f(0), 0, 0), \tag{16}$$

in which the expressions in $a_i$ and $A$ are to be considered as functions of $z, y_i, u$ with $z = f(x)$, $y_i = g_i(x)$, $u - A(z, y_i, u) = B^*(x)$. This follows from Lemma 1 and (9).

Any two subsequences of $B_\mu(x)$ converge to the same limit. In the opposite case we would have two functions $B^*(x)$ and $B^{**}(x)$, both satisfying (16). In view of (iii) the coefficients $a_0, a_1, \dots, a_n$ admit of continuous derivatives with respect to $z, y_i, u$, whence also with respect to $z, y_i, u - A$, and thus satisfy a Lipschitz condition of exponent 1 in these variables, in the closed domain $|z| \le F$, $|y_i| \le G_i$, $|u - A| \le FM_0 + H$. On account of (3) and (4), the derivatives $A_u$ and $A_{y_i}$ satisfy a Lipschitz condition with respect to $z, y_i, u$, whence with respect to $z, y_i, u - A$. Therefore we conclude from (16)

$$|B^*(x) - B^{**}(x)| \le K \sum_{i=1}^{n} \int_{0}^{x} |B^*(x) - B^{**}(x)|\,|dg_i(x)|, \tag{17}$$

where $K$ is a certain constant. On iterating (17) $m$ times we easily find

$$\max_{0 \le x \le X} |B^*(x) - B^{**}(x)| \le K^m \max_{0 \le x \le X} |B^*(x) - B^{**}(x)|\,\frac{\bigl(\sum_{i=1}^{n} G_i\bigr)^m}{m!}, \tag{18}$$

which gives $B^*(x) - B^{**}(x) = 0$ as $m \to \infty$.

Now the uniqueness of $B^*(x)$ implies, by the remark on page 444, the uniqueness of $u^*(x)$, which in turn justifies defining $u(x) = u^*(x)$ as the solution of (E) for the initial condition $u(0) = 0$.

† C. R. Adams and J. A. Clarkson, On convergence in variation, Bulletin of the American Mathematical Society, vol. 40 (1934), pp. 413-417.
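The uniqueness argument rests on the fact that iterating an integral inequality of the type (17) yields a factor that tends to zero. The sketch below assumes that this factor has the usual form $(KG)^m / m!$ with $G = \sum_i G_i$ (an assumption about the garbled display (18), and the standard outcome of such iterations); the numerical values of `K` and `G` are hypothetical.

```python
import math

def iterated_bound(K, G, m):
    """Factor (K*G)**m / m! assumed to arise after m iterations of (17)."""
    return (K * G) ** m / math.factorial(m)

def first_m_below(K, G, tol):
    """Smallest m for which the iterated factor drops below tol."""
    m = 0
    while iterated_bound(K, G, m) >= tol:
        m += 1
    return m

# Hypothetical constants: even when K*G is well above 1, the factorial wins.
K, G = 5.0, 2.0
m_needed = first_m_below(K, G, 1e-6)
```

The factor may grow for small `m` when `K * G > 1`, but the factorial in the denominator eventually dominates, which is why no smallness condition on $K$ or the $G_i$ is needed at this step.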

Remark. The assumptions of Theorem 3 state bounds for the functions $a_0, a_1, \dots, a_n(z, y_1, \dots, y_n, u)$ holding in a domain that depends on these same bounds. One may ask for a formulation of the theorem such that a statement results for any functions $a_0, a_1, \dots, a_n$, defined in an arbitrary neighborhood of the origin. Therefore we observe that there always exists a sub-neighborhood of the form $D_{3,\epsilon}$, provided the constants $F, G_1, \dots, G_n, \epsilon$ can be decreased sufficiently. Since the functions $g_{1\mu}(x), \dots, g_{n\mu}(x)$ are continuous, their total variations are also continuous and may be shown† to converge uniformly to those of $g_1(x), \dots, g_n(x)$. Thus, omitting at most a finite number of values of $\mu$ and taking the upper end of the $x$-interval sufficiently small, makes it possible to choose $G_1, G_2, \dots, G_n$ arbitrarily small. Whence we conclude the following theorem:


THEOREM 3'. Suppose that, in a neighborhood of the origin of a $(z, y_1, \dots, y_n, u)$-space the functions $a_0, a_1, \dots, a_n$ are continuously differentiable and the derivatives of $a_0$ satisfy a Lipschitz condition of exponent 1. Assume that $n$ continuously differentiable functions $g_{1\mu}(x), \dots, g_{n\mu}(x)$ are defined in an interval $I$ ($0 \le x \le X$), that they tend to continuous functions $g_1(x), \dots, g_n(x)$, and the total variations $T[g_{i\mu} - g_i] \to 0$ as $\mu \to \infty$, and that $g_{i\mu}(0) = 0$. Assume furthermore that continuously differentiable functions $f_\mu(x)$ converge to a function $f(x)$ in $I$ and that for all $\mu$ we have $|f_\mu(x)| \le F$. Then the solution $u_\mu(x)$ of (E$_\mu$) exists in a sufficiently small interval $0 \le x \le X'$ which does not depend on $\mu$, and converges there to a limit function $u(x)$ provided $F$ is sufficiently small.

The existence Theorems 3 and 3' can be supplemented by a study of the manner in which the solution $u(x)$ depends on a parameter $\alpha$ on which the known functions in (E) may be supposed to depend. Usual methods of proving the continuity of $u(x)$, considered as a function of $x$ and $\alpha$, from that of the known functions could be carried through with only slight modifications. Instead of (E) a system of differential equations of the form

$$du_h(x) = a_{0h}(f(x), g_i(x), u_k(x))\,df(x) + \sum_{i=1}^{n} a_{ih}\,dg_i(x), \qquad u_h(0) = 0, \tag{19}$$

where $h = 1, 2, \dots, m$, may be studied, and the existence of a solution $u_h(x)$, $h = 1, 2, \dots, m$, can be concluded by a method analogous to that used in the proofs of Theorems 2 and 3. In view of the similarity of the procedure we shall not carry out these generalizations.

† See Adams and Clarkson, loc. cit.

3. Double integrals. We denote by $T[\alpha, \beta]$ a triangle bounded by the line $\alpha = \beta$ of an $(\alpha, \beta)$-plane and the parallels to the axes through $(\alpha, \beta)$. Similarly $t[f, g]$ designates a triangle of an $(f, g)$-plane, bounded by $f = g$ and the parallels to the $f$- and $g$-axes through $(f, g)$. The elements of area $d\alpha\,d\beta$ and $df\,dg$ are to be counted positive. By $f(\alpha)$ and $g(\beta)$ we understand continuously differentiable mappings of the $\alpha$-axis on the $f$-axis and of the $\beta$-axis on the $g$-axis, which are, but for the elements used, identical with each other; $f(\alpha) = g(\beta)$ if $\alpha = \beta$. The range of the four variables $\alpha, \beta, f, g$ is the domain $D$ with origin as center

$$|\alpha| \le \omega,\quad |\beta| \le \omega,\quad |f| \le \omega,\quad |g| \le \omega, \tag{D}$$

and the function $f(\alpha)$ (and consequently $g(\beta)$) is such that for $\alpha$ and $\beta$ satisfying (D) the point $(\alpha, f(\alpha), \beta, g(\beta))$ belongs to $D$. Furthermore there are defined in $D$ three functions $a, b, c$ of $\alpha, f, \beta, g$ having continuous derivatives up to the fourth order.

We introduce three functions X, Y, Z in D by the relations 
(20) = 


(21) VY; = = Caab,, 


and the initial conditions 
(22) x 
4 


(23) 


= 0, | if f=g.
We find

Here, as in all integrals that follow, care has been taken to indicate the arguments of the integrand at least in one of the factors of the integrand, to denote the variables of integration by a prime and to denote by subscripts the partial derivatives, while we reserve the symbols $d/d\alpha$ and $d/d\beta$ for total derivatives with respect to these variables.

We are going to study the function 


I (a, f, B, g) X(a, f, B, g) fee, f(a’), B, g) Z) da!’ 
8 
(24) + f (Xela, f, B’, g(8’)) — 
B 


In order to abbreviate as much as possible, we write $A \sim B$ if $A - B$ is expressible as a polynomial in $a, b, c$, their first partial derivatives and their second partial derivatives of the type $\partial^2/\partial\alpha\,\partial\beta$, $\partial^2/\partial\alpha\,\partial g$, $\partial^2/\partial\beta\,\partial f$, $\partial^2/\partial f\,\partial g$. We write $A \simeq B$ if $A - B$ is expressible as an integral over a function which itself is $\sim 0$. Thus we find


B 


X,- (Xag(a’, f(a’), B, g) — Z,) da’ , 


Xa(a, f, B, g) — Xala, f(a), B, g) + Z(a, f(a), B, g) 
+ Xa(a, f, a, f(a)) — Y(a, f, a, f(e)) 


+f (- X ap + Zs + Caabs(a, f(@), 6’, g(8’)))dp’, 
8B 


+ f "(Xes(ay, f, B’, — 
8 


Xa(a, B, g) Xa(a, f, B, g(8)) + B, g(8)) 
+ X.(8, g(8), B, g) = Z(8, g(8), B, g) 


7 J f(a’), B, g) Zs) dal’ 
8 


f° Xa + 25+ Va casbsla’, fle’), B, 6(8)))de’, 
8 


(26) 


= cayzb,, 
Tyg = f, B, — Xse(a, f, B, g(B)) + f, B, g(8)), 
Toa = X gala, f, B, g) — Xoala, f(a), B, g) +Zo(a, f(a), B, g), 
Tas = Xas(a, f, g) — Xas(a, f(a), B, g) + Ze(a, f(x), B, g) 
— Zs(a, f(a), B, g(8)) + Xas(a, f(a), 8, g(8)) — Xas(a, f, B, g(8)) 
+ Ya(a, f, 8, g(8)) — Yala, f(a), 8, g(8)) + caabs(a, f(a), B, g(8)). 


(29) 


g 
f [(casby)s — (casbs), |dg’ + cazbg 
9 (8) 


g 
= f — (cas) obp|dg’ + caybs, 
9(B) 


(30) ~ cayzbg. 
Similarly 
(31) St 


Moreover, 


Xas(a, J, B, g) X f(a), B, g) + X as(a, f(@), B, g(8)) 
f 
X as(a, B, g(8)) = X aps o(@, fi B, g')df'dg’. 


f(a)“ 9(B) 
X = (Cazbg) ap = (Cadsbg + + cazbga)s 
Cadsb og + + + CAzab 98 + CAs) gas 
(Cadsbp) g + + (Caasby)s + (casbas)g + 
= — — gaz 
(caabs) of — [(caa) — (cydabs)g — (Caabsz) o- 


Hence, 


f g 
f X ass Ss B, g)df'dg’ =~ Caabg(a, B, g) Caabg(ar, f(a), B, g) 
f(a)“ 9(8) 


+ Caabs(a, f(a), B, g(8)) Caabs(a, f, B, g(B)) 


| 
Hence 
| 




Furthermore, 


Zale, f(a), B, Zola, f(a), 8, e(8))= fla), B, 


(8) 


f g(a, f(a), B, lade’ 


(8) 


[caabg(a, f(a), B, g’) 
(8) 


™caabs(a, f(a), B, g) f(a), B, g(8)). 
Similarly, 
Y B, g(8)) Y .(a, f(a), B, g(8))=caabs(a, f, B, g(8)) — Ca f(a), B, g(8)) 


Finally 
(32) T.a(@, f, B, g)=caabs(a, B, g) 


In view of (29), we find 
d?J df(a dg(p df(a) dg(B) 
dadB (a, f(a), B, g(8)) = caabg + cazbg + + caysb, 
dala, f(a), B, ab(a, (a), B, 8(6)) 


da 


I(a, f(a), B, g(8)) = 9, 


for Thus 

da db 
(33) f(a), B, g(8)) = — f f a’, f(a’), da’ dB. 
T da dg 


In order to transform I;(a, f, 8, g) we calculate 


a a (8’) 
f Xaj(a, B’, g(8’))dB’ f as'| X pz ola, f, 6’, g’ dg’ + Xas(a, f, | 
8 8 f 


i) as’ (casb g(a, f, B’, g))adg’ 
8 f 


a g(8’) 
~ f dp’ f (casbg(ay, f, B’, ode" 
B f 


= f 6’, g(8’)) casbg(a, f, 6’, 


dg 

and 

dI dI 

- - 0, 
da dg 




since, by (22), Xa;(a, f, 8’, f) vanishes. 
Consequently 


(34) I,(a, f, B, g) 
Similarly 
(35) I f, B, g) +0. 


Also we have 


X.(a, f, 8,4) = — f f S's Br 
t{f.a) 


where the integration is to be extended over the boundary of ¢[f, g]; whence 
X.(a, f, B, g) 0, Xa(a, f, B, g) 

Thus 

(36) Ta(a, f, a, g) = 0, 


as obviously Z~0. 
On writing 


8 
Tala, fy B, 8) = Tala, f, 8) + f, 6%, #48", 


we find by (36) and (32) 

(37) T.(a, f, 8, g) = 0. 
Similarly, 

(38) Ts(a, f, B, g) = 0. 
Finally 


8 
Ia, f, 8, = Wa, f, 8) + [Iola f, 6°, 


39 
I(a, f, B, 


The formulas (29), (30), (31), (32), (34), (35), (37), (38), and (39) prove that $I(\alpha, f, \beta, g)$, its first derivatives and its second derivatives of the type $\partial^2/\partial\alpha\,\partial\beta$, $\partial^2/\partial\alpha\,\partial g$, $\partial^2/\partial\beta\,\partial f$, $\partial^2/\partial f\,\partial g$ may be expressed in terms of $a, b, c$, their first and second derivatives of the same type, and integrals over products of such functions.

Henceforth the definition (24) of $I$ is to be replaced by the explicit formula whose abbreviated equivalent is (39), and which retains sense in the case that $a, b, c$ admit only of continuous first derivatives and of continuous second derivatives of the indicated type. If we uniformly approximate $a, b, c$ and said derivatives by polynomials in $\alpha, f, \beta, g$ and their respective derivatives, we may easily see that all of the formulas (29)-(32), (34), (35), (37)-(39) remain valid under the new assumptions and that $I(\alpha, f, \beta, g)$ still retains continuous first and second derivatives of said type.

Moreover a study of the dependence of $I$ on the function $f(\alpha)$ shows that convergence of a sequence of continuously differentiable functions $f_\mu(\alpha)$ to a limit function $f_0(\alpha)$ implies the convergence of the corresponding functionals $I_\mu(\alpha, f, \beta, g)$ to a limit functional $I_0(\alpha, f, \beta, g)$, and uniform convergence of $f_\mu(\alpha)$ to $f_0(\alpha)$ entails uniform convergence of $I_\mu$ to $I_0$. In fact, $f_\mu(\alpha)$ appears in the definition of $I_\mu(\alpha, f, \beta, g)$ only in limits of integration with respect to $f'$ or $g'$, which implies the convergence mentioned of $I_\mu$ to $I_0$.

From the formulas (29)-(32), (34), (35), (37)-(39) we can derive estimates for $I(\alpha, f, \beta, g)$ and its derivatives. Suppose first that in $|\alpha|, |f|, |\beta|, |g| \le \omega$

$$|a|, |b|, |c| \le K,\quad |a_\alpha|, \dots \le K',\quad |a_{\alpha\beta}|, \dots \le K''.$$

The terms suppressed in the above formulas by the use of the symbol $\sim$ are simple, double, triple, and quadruple integrals of polynomials of third degree in $a, b, c, a_\alpha, \dots$, with ranges of integration, respectively, $\le p\omega, p\omega^2, p\omega^3, p\omega^4$, where $p$ denotes a sufficiently large number, for instance, 64. On the other hand, any one of the polynomials to be integrated is numerically smaller than a suitable polynomial in $K, K', K''$ of third degree with positive coefficients. Hence there exists a polynomial $P$ with positive coefficients and of third degree in $K, K', K''$, such that for any one of formulas (29), ..., (39) the terms suppressed by the symbol $\sim$ are numerically less than or equal to

$$P(K, K', K'')\,(\omega + \omega^2 + \omega^3 + \omega^4).$$

We apply these estimates to the case where $a, b, c$ depend only indirectly on $\alpha, f, \beta, g$ and are functions of $\psi_1, \dots, \psi_n(\alpha, f, \beta, g)$ having third derivatives with respect to $\psi_1, \dots, \psi_n$. We assume that in $|\alpha|, |f|, |\beta|, |g| \le \omega$, the following quantities exist and satisfy

$$|\psi_1|, \dots, |\psi_n| \le k, \tag{40}$$

$$|\psi_{1\alpha}|, \dots, |\psi_{ng}| \le k',\quad |\psi_{1\alpha\beta}|, \dots \le k'',\quad (k, k', k'' > 0), \tag{41}$$

and that for $\psi_i$ satisfying (40)
| a |, | b |; | c| 
da 0b dc 


avi | api l’ | 
| 


| | | 


We then write 
a, f, B, 


In order to utilize the estimates found, it is legitimate to replace $K$ by $L$, $K'$ by $nL'k'$, $K''$ by $nL'k'' + n^2L''k'^2$. Thus from (29), ..., (39) we obtain


a, f, B, g 


(43) (w+ w + + L’, L”, k, kh’, 
S + (w+ w? + + w)g(L,--- Rk”), 


T(a, f, 8, 8) = r( 


where q is a suitable polynomial with positive coefficients. 
We next state bounds for the difference between 


B, *) B, ‘) 
y’ 


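and its derivatives, assuming $\psi_i'$ to be another system of continuously differentiable functions of $\alpha, f, \beta, g$ which satisfy the same inequalities (40), (41) as the $\psi_i$ themselves do. For the sake of simplicity we suppose $\omega \le \Omega$, with $\Omega > 0$ and denote by $u$ and $v$

$$u = \sum_{i=1}^{n} \max |\psi_i - \psi_i'| + \sum_{i=1}^{n} \max |\psi_{i\alpha} - \psi_{i\alpha}'| + \dots + \sum_{i=1}^{n} \max |\psi_{ig} - \psi_{ig}'|,$$

$$v = \sum_{i=1}^{n} \max |\psi_{i\alpha\beta} - \psi_{i\alpha\beta}'| + \sum_{i=1}^{n} \max |\psi_{i\alpha g} - \psi_{i\alpha g}'| + \sum_{i=1}^{n} \max |\psi_{i\beta f} - \psi_{i\beta f}'| + \sum_{i=1}^{n} \max |\psi_{ifg} - \psi_{ifg}'|,$$

the maxima to be taken for the domain $|\alpha|, |f|, |\beta|, |g| \le \omega$.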


s 
= 
| < 
| < 
n 
n 




We find by a procedure similar to the one previously used, 


y /| 


with $C$ depending on $\Omega, k, k', k'', L, L', L'', L'''$. In an analogous way the difference between


may be estimated under the assumption that the functions used in the formation of $\tilde I$, namely $\tilde a, \tilde b, \tilde c(\psi_1, \dots, \psi_n)$, and their first and second derivatives, differ by less than $\epsilon$ from those used for


Qa, »B, a, 


We now are able to formulate the main theorem of this section. Denote by 


We get 


Ta 


Tap 


(44.2) | uC + Ww, 
| uC + Cw, 
(45) | 


Tin IB, *) the functional I IB, ‘) 
formed for 
a= yj, b=, C = Cijt. 

THEOREM 4. Denote by $\omega > 0$ a number $\le \Omega$ and by $D$ the domain $|\alpha|, |f|, |\beta|, |g| \le \omega$. Suppose $c_{ijl}$ to be continuously differentiable up to the third order with respect to its arguments $\psi_1, \dots, \psi_n$, and that for $\psi_i$ in (40), $a = \psi_j$, $b = \psi_l$, $c = c_{ijl}$ the relations (42) hold. Suppose furthermore that in $D$ the functions $\psi_i^0(\alpha, f, \beta, g)$ are continuously differentiable and admit of continuous second derivatives of the type mentioned such that


(46) | f, B, g)| < &/2, 


(48) | Vies |, | Vio |; | |; | k’’/3. 
Then the system 


(49) vila, 8, g) vi (a, f, B, g) + > *) (i 1, 2, n), 
j,l=1 


has a solution $\psi_i(\alpha, f, \beta, g)$ existing and uniquely determined in $|\alpha|, |f|, |\beta|, |g| \le \omega'$, where $\omega' > 0$ is a number $\le \omega$ that may be determined with the aid of $\Omega, k, k', k'', L, L', L'', L'''$ only.* $\psi_{i\alpha}(\alpha, f, \beta, g), \dots, \psi_{ig}(\alpha, f, \beta, g)$ exist, are continuous and in absolute value $\le k'$, and $\psi_{i\alpha\beta}(\alpha, f, \beta, g), \dots, \psi_{ifg}(\alpha, f, \beta, g)$ exist, are continuous and in absolute value $\le \max(k'', 12n^5LL'^2k'^2)$.


We start the proof by increasing, if necessary, k’’ so as to satisfy the in- 
equality 
< 


We use successive approximations: 


1 = a, ,B, 
vila, f, B, g) (a, f, B, g) + > 


k,l=1 y° 


and generally for m=0 


(49. 1) (a, B, g) y? (a, 8, g) + 


k,l=1 


* In particular $\omega'$ does not depend on the function $f(\alpha)$ used in the definition of $I_{ijl}$.


f 

is 


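Denoting, as before, by $C$ a constant depending only on $\Omega, k, k', k'', L, L', L''$, by $\partial$ a generic first derivative with respect to $\alpha, f, \beta$, or $g$, and by $\partial^2$ a generic derivative of type $\partial^2/\partial\alpha\,\partial\beta$, $\partial^2/\partial\alpha\,\partial g$, $\partial^2/\partial\beta\,\partial f$, $\partial^2/\partial f\,\partial g$, we set, for $m > 0$,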
= 
m = — Ti; 
( y™ . yr! | 


+ max | Tiny") — | 


i,j,l=1 9 
a, f, B, ‘) a, f,B, | 
v 2 max ( ( yn l 


where the maxima are to be taken in $|\alpha|, |f|, |\beta|, |g| \le \omega'$ with $\omega' \le \omega$ to be determined later. On putting


y° 
we find, by (43),

$$u_0 \le C\omega', \tag{50}$$

$$v_0 \le 4n^5LL'^2k'^2 + C\omega', \tag{51}$$

and, for $u_m$ and $v_m$, the recursion formulas, in view of (44.1) and (44.2),

$$u_{m+1} \le C\omega'(u_m + v_m), \tag{52.1}$$

$$v_{m+1} \le Cu_m + C\omega' v_m, \tag{52.2}$$

provided, however, that we can choose $\omega' > 0$ so as to make sure the existence of all successive approximations in the common domain $D'$ ($|\alpha|, |f|, |\beta|, |g| \le \omega'$). Now determine $\omega' \le \Omega$ so small that


1— Cw’ > 0, (1 — Cw’)? — Cw’ > 0, 
Cw’(1 — Cw’) + (4n®LL’*k’? + Cw’)Co’ min (, ’) 
(1 — Cw’)? — 2 
Crw’ + (1 — Cw’) (4n5LL’2k’? + Cw’) 2k” 
(1 — Cw’)? — 


Note that 
i+ U+V)Co’ U 


U= 
(53) 


and 
+ Cw’ + CU + S V. 
Now the conditions (46), (47), and (48) permit the construction of 


‘) 


in D, hence a fortiori in D’ (||, |f|, |8|, |g] Sw’), and we certainly have, 
by (53), 

u U, 

Vo < V. 


Suppose that we could construct, throughout D’, the mth approximation 
¥i"(a, f, B, g) and that 


(54) U, 


(55) 
We are then able to prove that we can construct the (m+1)st approximation 
and that 

m+1 m+1 

0 0 
In fact, we have, in view of (54) and (55), (46), (47), and (48), 
| f, 8, S| f, B, + 
0 


| f, B, g) | +Usk, 


(56) 


(57) | f, B, g)| <| + Dou; 
0 


(58) | f, B, g)| < k”, 


and by (52), 


m+1 m 


ui + Col + U+V)<U, 
0 0 


0 
m+1 


S w+ CD uj + Co! 0; + Co! + CU + Co'V Vz 
0 0 0 


4 
F 
. 
0 
fe 
if 


Thus (54) and (55) hold for all $m \ge 0$, and we conclude the uniform convergence of $\psi_i^m(\alpha, f, \beta, g)$, $\partial\psi_i^m$, $\partial^2\psi_i^m$ to limit functions $\psi_i(\alpha, f, \beta, g)$ and their corresponding derivatives $\partial\psi_i$, $\partial^2\psi_i$. The continuity relation (45) finally


proves 
B, *) B, 


Hence by passage to the limit in (49.1) we obtain (49). 
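The mechanism of the proof, successive approximations whose differences shrink fast enough on a sufficiently small domain, can be seen in a much simpler setting. The sketch below is a toy analogue only, not the system (49): it applies Picard iteration to the hypothetical scalar equation $u(x) = u_0 + \int_0^x c(u(t))\,dt$ with $c(u) = u^2$, $u_0 = 1$, on the short interval $[0, 0.25]$, where the exact solution is $1/(1 - x)$.

```python
def picard_iterates(c, u0, w, iterations=40, steps=1000):
    """Successive approximations u_{m+1}(x) = u0 + integral_0^x c(u_m(t)) dt
    on [0, w], with the integral evaluated by the trapezoidal rule.
    Returns the last iterate (grid values) and the sup-norm differences
    between consecutive iterates."""
    h = w / steps
    u = [u0] * (steps + 1)          # zeroth approximation: the constant u0
    diffs = []
    for _ in range(iterations):
        new = [u0]
        acc = 0.0
        for k in range(steps):
            acc += 0.5 * h * (c(u[k]) + c(u[k + 1]))
            new.append(u0 + acc)
        diffs.append(max(abs(a - b) for a, b in zip(new, u)))
        u = new
    return u, diffs

# Hypothetical toy problem: u' = u^2, u(0) = 1, on the short interval [0, 0.25].
u, diffs = picard_iterates(lambda v: v * v, 1.0, 0.25)
exact_end = 1.0 / (1.0 - 0.25)
```

As in the proof, the interval length plays the role of $\omega'$: it is chosen small enough that the differences between consecutive approximations form a convergent series, so the approximations converge uniformly.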
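The uniqueness follows similarly from the relations analogous to (44.1) and (44.2),

$$u \le C\omega'(u + v), \tag{59.1}$$

$$v \le Cu + C\omega' v, \tag{59.2}$$

which yield $u(1 - C\omega') \le C\omega' v \le C\omega' \cdot Cu/(1 - C\omega')$, $u\bigl((1 - C\omega')^2 - C^2\omega'\bigr) \le 0$, $u = 0$, $v = 0$, with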


“= > max ‘) = 


i,7,l=1 
+ | — |, 
max | — Tin’) |, 


Vila, f, B, g) and ¥/ (a, f, B, g) being solutions of (49). 


COROLLARY 1. If, in Theorem 4, $f(\alpha)$, $\psi_i^0(\alpha, f, \beta, g)$, $\partial\psi_i^0(\alpha, f, \beta, g)$, $\partial^2\psi_i^0(\alpha, f, \beta, g)$, and $c_{ijl}(\psi_1, \dots, \psi_n)$ and its derivatives up to the third order depend on a parameter $\mu$ and converge uniformly as $\mu \to \mu_0$, then $\psi_i(\alpha, f, \beta, g)$, $\partial\psi_i$, $\partial^2\psi_i$ converge uniformly, as $\mu \to \mu_0$, in $|\alpha|, |f|, |\beta|, |g| \le \omega'' \le \omega'$, where $\omega''$ depends only on $\Omega, k, k', k'', L, L', L'', L'''$.

Denote by $\Delta$ the operation of taking the difference for two sufficiently large values of $\mu$, and put, in the successive approximations of the proof of Theorem 4,

n 


tim = >. | Avz"(a, f, B, g)| + Ady" |, 


i=1 


2, | B, g) |. 


Observing that $f(\alpha)$ enters in the functional $I$ only as a limit of integration, as has been remarked earlier, we may use (45) and find, with some $C = C(\Omega, k, k', k'', L, L', L'', L''')$ and a new and smaller value $\omega''$ of $\omega'$, satisfying (53) with the new $C$:

Um C(tm—1 + + €)w”’ + Ce + uo, 

Vm Culms + +Ce+ Vo, 

Uo < €, Vo = €. 
Hence, for m—> ©, lim u, =, lim 7, =v 

us Ciutovt ew” +Cet+ um, 

Cut+ Cw” + Ce+ wm, 
u(i — S (C + + Cw” + 
C ” C 1 
1 — 1 — Cw” 
Hence $u$ and $v$ are $\le C'\epsilon$, with $C' = C'(C, \omega'')$, which proves the corollary.

COROLLARY 2. If, in Theorem 4, $f(\alpha)$, $\psi_i^0(\alpha, f, \beta, g)$, $c_{ijl}$ depend on a parameter $\mu$, and if $f(\alpha)$ converges uniformly as $\mu \to \mu_0$, then there exists a subsequence of $\mu$, such that $\psi_i(\alpha, f, \beta, g)$ and also $\psi_i(\alpha, f(\alpha), \beta, g(\beta))$ converge uniformly.

For by (57), $|\partial\psi_i(\alpha, f, \beta, g)| \le k'$ and hence the $\psi_i(\alpha, f, \beta, g)$ are equicontinuous and bounded, in view of (56). Hence there exists a uniformly convergent subsequence, and the corollary follows. From (33) we conclude under the assumptions of the theorem, that


dy ; 


, f(a), B, g(8)) = ° (a, f(a), 16, 6(8)) + — 


and, 
dm = 


In the application we intend to make of Theorem 4 and its corollaries, the values of the constants such as $\Omega, k, k', k'', L, L', L'', L'''$ are of no importance. What matters, however, is their existence and their interdependence. Therefore, we are led to use the following terminology: we call a function bounded if its absolute value is bounded by a positive number irrespective of the values of its arguments and possible other parameters; we call, in a theorem, a quantity relatively bounded if its absolute value can be bounded by a positive number which depends only on other bounds previously introduced in the theorem; and we use the same term, in a proof, as meaning limitable by bounds, either assumed by the hypotheses of the theorem, or previously introduced in the course of the same proof.


i 
A 
+ 
60 : 




Thus Theorem 4 and Corollary 2 may be formulated as follows: 


THEOREM 5. Suppose $c_{ijl}$ and its derivatives up to the third order with respect to its arguments $\psi_1, \dots, \psi_n$ bounded for bounded values of $\psi_i$, and assume that $\psi_i^0(\alpha, f, \beta, g)$, $i = 1, \dots, n$, has derivatives $\partial\psi_i^0$, $\partial^2\psi_i^0$, which are continuous and bounded when $\alpha, f, \beta, g$ are bounded. Then the system


B, g) (a, B, g) + (i = 1,2,--: ? n), 
j,l=1 

has a solution $\psi_i(\alpha, f, \beta, g)$, continuous together with $\partial\psi_i$ and $\partial^2\psi_i$. This solution exists, is uniquely determined and is relatively bounded together with the derivatives $\partial\psi_i$, $\partial^2\psi_i$ for relatively bounded $\alpha, f, \beta, g$. If, in addition, $f(\alpha)$, $\psi_i^0(\alpha, f, \beta, g)$ and $c_{ijl}$ and its derivatives up to the third order depend on a parameter $\mu$, and if $f(\alpha)$ converges uniformly as $\mu \to \mu_0$, then there exists a subsequence of $\mu$ such that the corresponding functions $\psi_i(\alpha, f, \beta, g)$ and $\psi_i(\alpha, f(\alpha), \beta, g(\beta))$ converge uniformly for relatively bounded values of $\alpha, f, \beta, g$.


4. Hyperbolic systems. The results of the preceding section may be used
for a study of Cauchy's problem* for the system

(61)    $\sum_{j=1}^{n} a_{ij}\,\dfrac{\partial\phi_j(\alpha,\beta)}{\partial\alpha}=0,\qquad i=1,2,\dots,m<n,$

        $\sum_{j=1}^{n} a_{ij}\,\dfrac{\partial\phi_j(\alpha,\beta)}{\partial\beta}=0,\qquad i=m+1,\dots,n,$

in which $a_{ij}$ and its partial derivatives up to the fourth order as well as the
reciprocal value of the determinant $|a_{ij}|$ are bounded for bounded values of
$\phi_1,\dots,\phi_n$. The initial line is a bounded neighborhood of the origin on the
line $\alpha=\beta$ and on it the unknown functions $\phi_1(\alpha,\beta),\dots,\phi_n(\alpha,\beta)$ assume
relatively bounded values $\xi_1(\alpha),\dots,\xi_n(\alpha)$, which are continuously
differentiable.
In view of the applications we subject the $\xi_1(\alpha),\dots,\xi_n(\alpha)$ to the following
condition:
Condition #. $\xi_1(\alpha),\dots,\xi_n(\alpha)$ depend continuously on $\alpha$, and there
exists a transformation


* A study of this Cauchy problem with a view to enlarging the class of admissible initial
conditions was undertaken by Margaret Gurney in her dissertation, Brown University, 1935
(unpublished).


1938] GENERALIZED INTEGRALS 461 


with constant $\gamma_{ij}$ of determinant +1 such that the derivatives of $\zeta_2(\alpha)$,
$\zeta_3(\alpha),\dots,\zeta_n(\alpha)$ are bounded.

Then there exists a relatively bounded solution $\phi_1(\alpha,\beta),\dots,\phi_n(\alpha,\beta)$
of (61) in a relatively bounded $(\alpha,\beta)$-neighborhood of the origin, assuming
the given initial values, continuous in $\alpha$, $\beta$, and continuously differentiable
with respect to $\alpha$ and $\beta$.

It should be noticed that the essential content of the above statement lies
in the fact that the derivative of $\zeta_1(\alpha)$ has no influence on the determination
of the domain of existence.

The idea of the proof is to construct instead of functions $\phi_i(\alpha,\beta)$ other
functions $\psi_i$ of four arguments $\alpha, f, \beta, g$ which reduce to the solution of the
initial problem in question for $f=\zeta_1(\alpha)$, $g=\zeta_1(\beta)$. In order to conform with the
terminology formerly introduced, we henceforth shall identify $\zeta_1(\alpha)$ with $f(\alpha)$.

We try to satisfy the following conditions for functions y/;,°(a, f, B, g): 


aye 
dadp 
VP (a, fla), a, f(a)) = &(a), 


(i) (a, f(a), B, g(8)) = 0, 


viata, fla), a, f(a)) 


df(a 
+ Vis(@, f(a), a, f(a)) ‘| 0, is m, 


| fa, a, fle) 
+ Viola, f(a), a, f(a)) | =0, i>m. 


df(a) 
da 
We first introduce y? (a, f, a, f) by 


(62) (a, f, a, f) = + 
k=2 
which yields 
(63) Visa, f, a, f) + viola, f, Qa, f) =Ta. 
To determine y?, and yf, we set up the system 
I, a, NWisla, a, f) = 0, 
k=l 
(4) 
aix(y'(a, Qa, Qa, f) 0, 


k=1 


} 
4 
(ii) 
i=1,2,---,n, 
$= 1,---,m, i 
i=m+1,---,n, 




which together with (63), in view of the boundedness of |a,|~!, deter- 
mines Wi,(a, f, a, f) and Wi? (a, f, a, f) as analytic functions of ai.(p"(a, f, a, f)) 
and thus as relatively bounded functions with relatively bounded and con- 
tinuous total derivatives with respect to a and f. 
From (ii) 
dg 


Viela, f, a, f) + Visle, fra, f) = d $= 1,2,---,#, 
k=2 a 


and by (64) and (iii) 


n 


(65) 
f(a), a, f(@)) = 0, i> m, 
which determine y, (a, f(a), a, f(a)) and Y%s(a, f(a), a, f(a)) as continuous 


and relatively bounded functions of a for relatively bounded a. 
We now put 


f 
Ada, f) = f 


Obviously, $A_i(\alpha,f)$ has continuous and relatively bounded derivatives with
respect to $f$ and $\alpha$.
Finally put


V; (a, f, B, g) = vi (a, f, a, f) + A,(8, g) — A,(a, f) 
(66) “ f fla’), a, fla’)) — Aia(a’, f(a’) 
8B 


The reader will easily verify that the function $\Psi_i^0(\alpha,f,\beta,g)$, as defined by (66),
has the following properties:


Vi (a, f, a, f) vi (a, f, a, 
f, f) = virla, f, a, f), 


dA (a, 
Vis(a, f, a, f) Visla, f, a, f), 


if df 
Vis(a, f(a), a, f(a)) = vis(a, f(a), a, f(a)), 
Viale, f(a), a, f(a)) = Wiala, f(a), «, f(a)). 




Hence we are justified in considering $\Psi_i^0(\alpha,f,\beta,g)$ as an extension of those
elements of the unknown function $\psi_i^0(\alpha,f,\beta,g)$ which were used in the
construction of $\Psi_i^0(\alpha,f,\beta,g)$, and we write $\Psi_i^0(\alpha,f,\beta,g)=\psi_i^0(\alpha,f,\beta,g)$. We have,
moreover, $\partial^2\Psi_i^0=0$ so that (i) is true. From (62) we obtain (ii). Formulas
(64) and (65) give (iii).

Evidently, $|\psi_i^0(\alpha,f,\beta,g)|$ is less than an arbitrary positive number $\epsilon$ if
the bounds of the initial data $\xi_1(\alpha),\dots,\xi_n(\alpha)$ are sufficiently small and if
$\alpha, f, \beta, g$ are relatively bounded.

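On differentiating the first $m$ equations of (61) with respect to $\beta$, and the
last $(n-m)$ equations with respect to $\alpha$ and solving with respect to the mixed
derivatives, we obtain a system of the form

(67)    $\dfrac{\partial^2\phi_i(\alpha,\beta)}{\partial\alpha\,\partial\beta}=\sum_{j,l=1}^{n} c_{ijl}\,\dfrac{\partial\phi_j(\alpha,\beta)}{\partial\alpha}\,\dfrac{\partial\phi_l(\alpha,\beta)}{\partial\beta},$

where the $c_{ijl}(\phi_1,\dots,\phi_n)$ have bounded derivatives up to the third order
for bounded $\phi_1,\dots,\phi_n$. Replacing $\phi$ by $\psi$, we solve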


Vila, B, g) (a, B, g) + > (“ ‘) 


j,l=1 


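with the aid of Theorem 5. By (60) and (i) we have

$\dfrac{d^2\psi_i(\alpha,f(\alpha),\beta,g(\beta))}{d\alpha\,d\beta}=\sum_{j,l=1}^{n} c_{ijl}\bigl(\psi(\alpha,f(\alpha),\beta,g(\beta))\bigr)\,\dfrac{d\psi_j(\alpha,f(\alpha),\beta,g(\beta))}{d\alpha}\,\dfrac{d\psi_l(\alpha,f(\alpha),\beta,g(\beta))}{d\beta}.$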


In view of (ii), $\phi_i(\alpha,\beta)=\psi_i(\alpha,f(\alpha),\beta,g(\beta))$ assumes the given initial values,
satisfies (61) on $\alpha=\beta$, and has continuous derivatives with respect to $\alpha$ and $\beta$
and continuous mixed second derivatives with respect to $\alpha$ and $\beta$.

A conclusion familiar in the theory of hyperbolic equations shows that
equations (61) are satisfied identically in $\alpha$ and $\beta$.
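
The passage from (61) to the system of mixed derivatives used above can be sketched as follows. This is only an illustration under the assumption that (61) is the homogeneous system $\sum_j a_{ij}\,\partial\phi_j/\partial\alpha=0$ for $i\le m$ and $\sum_j a_{ij}\,\partial\phi_j/\partial\beta=0$ for $i>m$; the symbol $b_{ik}$ for the elements of the inverse matrix of $(a_{ij})$ is introduced here for the purpose and does not occur in the text.

```latex
% First m equations, differentiated with respect to \beta (k \le m):
\sum_{j=1}^{n} a_{kj}\,\frac{\partial^2\phi_j}{\partial\alpha\,\partial\beta}
  = -\sum_{j,l=1}^{n}\frac{\partial a_{kj}}{\partial\phi_l}\,
     \frac{\partial\phi_j}{\partial\alpha}\,\frac{\partial\phi_l}{\partial\beta}.
% Last n-m equations, differentiated with respect to \alpha (k > m),
% with the summation indices j and l interchanged on the right:
\sum_{j=1}^{n} a_{kj}\,\frac{\partial^2\phi_j}{\partial\alpha\,\partial\beta}
  = -\sum_{j,l=1}^{n}\frac{\partial a_{kl}}{\partial\phi_j}\,
     \frac{\partial\phi_j}{\partial\alpha}\,\frac{\partial\phi_l}{\partial\beta}.
% Multiplying by b_{ik} and summing over k solves for the mixed derivatives:
\frac{\partial^2\phi_i}{\partial\alpha\,\partial\beta}
  = \sum_{j,l=1}^{n} c_{ijl}\,
    \frac{\partial\phi_j}{\partial\alpha}\,\frac{\partial\phi_l}{\partial\beta},
\qquad
c_{ijl} = -\sum_{k\le m} b_{ik}\,\frac{\partial a_{kj}}{\partial\phi_l}
          -\sum_{k>m} b_{ik}\,\frac{\partial a_{kl}}{\partial\phi_j}.
```

Under this assumption each $c_{ijl}$ involves first derivatives of the $a_{ij}$ and the reciprocal of the determinant, which agrees with the loss of one order of differentiability (fourth order for $a_{ij}$, third order for $c_{ijl}$) stated in the text.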

Thus we have established the following theorem: 


THEOREM 6. If in (61) $a_{ij}$ and its partial derivatives up to the fourth order
and the reciprocal value of the determinant $|a_{ij}|$ are bounded for bounded values
of $\phi_1,\dots,\phi_n$, and if the initial values of $\phi_i(\alpha,\beta)$ on $\alpha=\beta$ are relatively
bounded in a bounded neighborhood of $\alpha=0$ $(=\beta)$ and satisfy the above condition, then
Cauchy's problem has a solution existing for all relatively bounded $\alpha$, $\beta$. This
solution has continuous derivatives with respect to $\alpha$ and $\beta$. If the initial values
and the $a_{ij}$ depend on a parameter $\mu$ and converge uniformly as $\mu\to\mu_0$, then there
exists a subsequence of $\mu$, for which the corresponding solutions $\phi_i(\alpha,\beta)$ converge
uniformly.




COROLLARY 1. The solution of Theorem 6 is unique.*


Let $|\alpha|<A$, $|\beta|<B$ be the common domain $D$ of existence of two solutions
of our initial problem. Denote by $u(\tau)$, $v(\tau)$, $w(\tau)$


u(r) = > max | Agi|, o(r) = 


i=1 i 
09; | 
wT) = max | A — |, 
©) op 
where the operator $\Delta$ indicates the difference of the expression following $\Delta$ for
the two solutions, and the maximum is to be taken on that segment of the line
$\tau=|\alpha-\beta|$ which is contained in $D$. By (67) we have for a suitable constant $K$


u(r) @+ w)| u(0) = 0, 
0 


SK (tot 2(0) = 0, 


w(r) Kf dl, w(0) = 0. 


w)| 
0 


and by the well known iteration u=v=w=0. 
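
The iteration referred to can be illustrated by a minimal sketch. It assumes that the three inequalities have been combined into a single one for the sum $s(\tau)=u(\tau)+v(\tau)+w(\tau)$, with a constant $C$ and a bound $M$ for $s$ on the segment considered; $s$, $C$ and $M$ are names introduced here only for the illustration.

```latex
% Assumption: 0 \le s(\tau) \le M and s(\tau) \le C \int_0^\tau s(t)\,dt, s(0)=0.
s(\tau)\le C\int_0^{\tau} M\,dt = M\,C\tau,
\qquad
s(\tau)\le C\int_0^{\tau} M\,Ct\,dt = M\,\frac{(C\tau)^2}{2!},
\qquad\dots,\qquad
s(\tau)\le M\,\frac{(C\tau)^k}{k!}\;\longrightarrow\;0\quad (k\to\infty).
```

Since $u$, $v$, $w$ are non-negative and their sum vanishes, each of them vanishes.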

By reasoning very similar to the preceding it may be shown that the
dependence of the initial data on a parameter such that the initial data of $\phi_i$
and those of $\partial\phi_i/\partial\alpha$, $\partial\phi_i/\partial\beta$ satisfy a Lipschitz condition of exponent 1 in
the parameter implies a Lipschitz condition of exponent 1 in the solution.
Furthermore, passing to the limit from difference quotient to derivative with
respect to the parameter we obtain the following corollary:

COROLLARY 2. If, in Theorem 6, the initial data of $\phi_i$ and those of $\partial\phi_i/\partial\alpha$,
$\partial\phi_i/\partial\beta$ are continuously differentiable with respect to a parameter, the solution
and its first derivatives with respect to $\alpha$ and $\beta$ are also continuously differentiable
with respect to the parameter, continuity being understood with respect to the
parameter and variables.


* Cf. Hadamard, Leçons sur le Problème de Cauchy, Paris, 1932, pp. 488-501.


THE UNIVERSITY OF CALIFORNIA,
BERKELEY, CALIF. 




DECOMPOSITIONS AND DIMENSION OF CLOSED
SETS IN $R^n$*


BY 
ARTHUR N. MILGRAM 


1. Introduction. It is our purpose to give a characterization of the dimension
of closed sets immersed in $R^n$ (euclidean space of $n$ dimensions). This is
done in terms of certain properties of the decompositions of these sets into a
countable infinity of closed sets. The results are well known for finite
decompositions of compact sets, but have never been shown to be so intrinsic a
property of dimension as to remain valid under countable decompositions.
This may be due to the fact that the proof seems to require much of the
technique and many of the results of Alexandroff,† which are of very recent
development. We may state our principal result as follows: A closed subset $F$
of $R^n$ is of dimension $r$ if and only if there exists an $\epsilon>0$, such that $F$ may be
decomposed into the sum of a countable infinity of closed sets $F_1, F_2, \dots, F_i, \dots$
of diameter less than $\epsilon$, for which $\dim F_i\cdot F_j\le r-1$, $i\ne j$, but for any such
decomposition there exists a pair of integers $m$ and $n$ such that $\dim F_m\cdot F_n=r-1$.

This result follows quite readily when we have proved the following: If $F$
is a closed subset of $R^n$, $p$ a point of $F$, $F_1, F_2, \dots, F_i, \dots$ a decomposition
of $F$ into closed sets, $z^{n-r-1}$ a cycle in $S(p,\epsilon)-F$, which does not bound in
$S(p,\epsilon)-F$ but does bound in $S(p,\epsilon)-F_i$, $i=1, 2, \dots$, then there exists a pair
of integers $m$ and $n$ such that $\dim F_m\cdot F_n\cdot S(p,\epsilon)\ge r-1$. From this we obtain
an interesting result which may be considered a generalization of a theorem
due to Miss Mullikin.‡ We show that the sum of a countable number of
closed sets, no one of which separates $R^n$, and the dimension of whose
intersections taken pairwise does not exceed $n-3$, cannot separate $R^n$.

The author takes this opportunity to express his gratitude to Professor 
J. R. Kline, whose suggestions and unfailing encouragement made this paper 
possible. 

2. Notation. The notation and definitions used in the sequel are widely
employed. For example, $\delta(M)$ refers to the diameter of a point set $M$,
$\rho(M_1, M_2)$ to the distance between the sets $M_1$ and $M_2$, $S(M,\epsilon)$ to the set of
points $x$ such that $\rho(M,x)<\epsilon$. We shall denote the boundary of a point set
$M$ by $B(M)$.
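
In symbols, and assuming the usual metric conventions (the explicit formulas below are the standard definitions, supplied here for convenience and not quoted from the text, with $\rho(x,y)$ the euclidean distance between points):

```latex
\delta(M)=\sup_{x,\,y\in M}\rho(x,y),
\qquad
\rho(M_1,M_2)=\inf_{x\in M_1,\;y\in M_2}\rho(x,y),
\qquad
S(M,\epsilon)=\{\,x\in R^n:\ \rho(M,x)<\epsilon\,\}.
```

In particular $S(p,\epsilon)$, for a single point $p$, is the open sphere of radius $\epsilon$ about $p$.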

* Presented to the Society, March 27, 1937; received by the editors April 4, 1937. 

† Especially his article Dimensionstheorie, Mathematische Annalen, vol. 106 (1932), pp. 161-238.

‡ A. M. Mullikin, Certain theorems relating to plane connected point sets, these Transactions,
vol. 24 (1922), pp. 144-162.




466 A. N. MILGRAM [May 


The superscript attached to a symbol representing a chain will denote the
dimension of the chain. The complex composed of the simplices of a chain $C^i$
will be denoted by $|C^i|$. Cycles and chains occurring in this article will be
assumed to have integral coefficients, although most of the results are just
as valid if chains modulo $m$ ($m\ne0$) or rational chains are used. The relation
expressing the fact that the cycle $z$ is (is not) the boundary of a chain in the
domain $D$ will be written $z\sim0$ in $D$ ($z$ non-$\sim0$ in $D$).

3. Dimension of simplices. Some years ago Sierpiński* proved that no
continuum M can be expressed as the sum of a countable number of closed 
and proper subsets of M whose intersections are mutually vacuous. Remem- 
bering that the vacuous set is of dimension —1 and applying this theorem to 
the one-simplex, we have as an immediate result: 


COROLLARY. No one-simplex can be decomposed into the sum of a countable
number of closed sets of diameter less than $\epsilon>0$, where $\epsilon$ is less than the diameter of
the simplex, and the dimension of the intersection of any pair of these closed sets
is minus one.

When the theorem is stated in this form, one is led to anticipate the more 
general statement: 


THEOREM 1. No $r$-simplex $T^r$ can be decomposed into the sum of a countable
number of closed sets of diameter less than $\epsilon>0$, where $\epsilon$ is less than the diameter
of $T^r$ and the dimension of the intersection of any pair of these closed sets is at
most $r-2$.


Proof. Suppose the theorem false. Letting $F_1, F_2, \dots, F_i, \dots$ denote
the closed sets referred to in the statement of the theorem, we have

$T^r=\sum_{i=1}^{\infty}F_i$

and

$\dim F_i\cdot F_j\le r-2,\qquad \delta(F_i)<\epsilon,\qquad i\ne j.$

Denote the totality of intersections $F_i\cdot F_j$ by $P_1, P_2, \dots$ and their sum
by $P$. Each $P_i$ is closed.

Since the sum of a countable infinity of closed sets of dimension at most
$r-2$ is of dimension at most $r-2$, it follows that $P$ cannot fill any domain†
in $T^r$.‡ But in the complement of each $F_i$, since $F_i$ is of diameter less than


* W. Sierpiński, Un théorème sur les continus, Tôhoku Mathematical Journal, vol. 13 (1918),
pp. 300-304.

† A domain is a connected open set. See P. Alexandroff and H. Hopf, Topologie, vol. 1, p. 51.

‡ P. Urysohn, Mémoire sur les multiplicités Cantoriennes, Fundamenta Mathematicae, vol. 8
(1927), pp. 337-341.


1938] CLOSED SETS IN R" 467 


$\delta(T^r)$, there exists a domain. From this it follows readily that at least two
of the sets $F_i-P$ $(i=1, 2, \dots)$ are non-vacuous.

Let $p$ be a point of $F_{i_1}-P$, and $q$ a point of $F_{i_2}-P$, $i_1\ne i_2$. We may assume
$p$ and $q$ to be interior points of $T^r$. Now $P_1$ is a closed set, and $\dim P_1\le r-2$.
Consequently $p$ can be joined to $q$ by a polygonal line $t_1$ in the interior of
$T^r-P_1$.* We enclose $t_1$ in a domain $D_1$ whose closure does not meet $P_1$.

Suppose we have constructed the domains $D_1, D_2, \dots, D_{i-1}$, where

1. $\bar D_k\subset D_{k-1}$,

2. $\bar D_k\cdot P_k=0$, and

3. $D_k\supset p+q$ $(k=1, 2, \dots, i-1)$.
In the construction of $D_i$ we observe that $\dim P_i\cdot D_{i-1}\le r-2$. Hence $p$ and $q$
can be joined by a polygonal line $t_i$ lying in $D_{i-1}-P_i\cdot D_{i-1}$, which we then
enclose in a domain $D_i$ whose closure is contained in $D_{i-1}$ and does not
meet $P_i$.

We thus obtain the sequence of continua

(a)    $\bar D_1, \bar D_2, \dots, \bar D_i, \dots,$

where (a) satisfy relations 1, 2, and 3 above. $\Pi=\bar D_1\cdot\bar D_2\cdots\bar D_i\cdots$ is a
continuum containing $p$ and $q$, and, as is easily seen, containing no point
of $P$. But the decomposition of $\Pi$ into the closed sets $\Pi\cdot F_i$ $(i=1, 2, \dots)$,

$\Pi=\sum_{i=1}^{\infty}\Pi\cdot F_i,$

affords a contradiction to Sierpiński's theorem, since at least two of these sets
($\Pi\cdot F_{i_1}$ and $\Pi\cdot F_{i_2}$) are non-vacuous, whereas the intersection of any pair is
vacuous. This contradiction establishes the theorem.

By a slight modification in the method of constructing the domains $D_i$
(that is, by constructing $D_i$ as a chain of regions whose diameters are less
than $1/i$), we could have taken $\Pi$ to be an arc.† We should thus have
obtained an incidental proof of the following theorem:


THEOREM 2. The complement in an n-dimensional simplex (or in R") of the 
sum of a countable infinity of closed sets of dimension at most n—2 is arcwise 
connected. 
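
The simplest instance of Theorem 2 may be made explicit. The following is an illustrative example, not part of the original argument: take $n=2$ and let the closed sets be the single points $q_1, q_2, \dots$ of the plane having both coordinates rational, each of dimension $0=n-2$.

```latex
% Q^2 is the sum of countably many one-point sets:
Q^2=\sum_{k=1}^{\infty}\{q_k\}\subset R^2 .
% Let (a,b) be a point of R^2 - Q^2, so that a or b is irrational.
% If a is irrational, the two segments
(a,b)\;\longrightarrow\;(a,\sqrt2)\;\longrightarrow\;(\sqrt2,\sqrt2)
% lie on the lines x=a and y=\sqrt2, neither of which meets Q^2.
% If b is irrational, the two segments
(a,b)\;\longrightarrow\;(\sqrt2,b)\;\longrightarrow\;(\sqrt2,\sqrt2)
% lie on the lines y=b and x=\sqrt2, neither of which meets Q^2.
```

Every point of $R^2-Q^2$ is thus joined to $(\sqrt2,\sqrt2)$ by a polygonal line in $R^2-Q^2$, so any two such points are joined by a polygonal line of at most four segments, which contains an arc joining them.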


Although, as observed in Theorem 1, an r-simplex cannot be decomposed 
into small closed sets with mutual intersections of dimension at most r—2, 


* Cf. P. Urysohn, loc. cit., p. 307. 

† For a discussion of this method of characterizing an arc, see R. L. Moore, On the foundations
of plane analysis situs, these Transactions, vol. 17 (1916), pp. 133-139.

‡ This was first proved for the case $n=2$, i.e., for the plane, by J. R. Kline, Concerning the
complement of a countable infinity of point sets of a certain type, Bulletin of the American
Mathematical Society, vol. 23 (1916-1917), pp. 290-292.




it can always be decomposed into a countable number of closed sets, of arbi- 
trarily small diameter, whose intersections taken pairwise are of dimension 
at most r—1 (for example, by a simplicial subdivision). We may therefore 
characterize the dimension of a simplex in the following way: 


THEOREM. A simplex $T$ is of dimension $r$ if and only if for any $\epsilon$, where
$\delta(T)>\epsilon>0$, $T$ may be decomposed into the sum of a countable infinity of closed
sets $F_1, F_2, \dots, F_i, \dots$ of diameter less than $\epsilon$, for which $\dim F_i\cdot F_j\le r-1$,
$i\ne j$, but for any such decomposition there exists a pair of integers $m$ and $n$ such
that $\dim F_m\cdot F_n=r-1$.
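
For $r=1$ the decomposition asserted by the theorem can be written down directly. The sketch below is an illustration only; the integer $N$ and the particular sets are chosen here and are not taken from the text.

```latex
% T = [0,1], 0 < \epsilon < 1 = \delta(T); choose an integer N > 1/\epsilon.
F_k=\Bigl[\tfrac{k-1}{N},\,\tfrac{k}{N}\Bigr]\quad(k=1,\dots,N),
\qquad
F_{N+j}=\{0\}\quad(j=1,2,\dots),
\qquad
T=\sum_{i=1}^{\infty}F_i .
% Each F_i has diameter at most 1/N < \epsilon.
% For i \ne j the product F_i \cdot F_j is vacuous or a single point:
\dim F_i\cdot F_j\le 0=r-1,
\qquad
\dim F_1\cdot F_2=\dim\{\tfrac1N\}=0=r-1 .
```

That some pair must meet, in every such decomposition, is exactly the content of the corollary to Sierpiński's theorem stated above.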

This, as well as Theorem 1, is a special case of more general considerations 
to be developed independently in a following section. Its chief interest lies 
in the simple proof based entirely on set-theoretic considerations. The same, 
or similar, methods do not seem adequate for a treatment of closed sets in $R^n$.

4. Some preliminary lemmas and considerations. After this simple char- 
acterization of the dimension of a simplex, it is quite natural to define the 
dimension of a closed set F in an analogous fashion. The definition is an in- 
ductive one, where the vacuous set is defined to be of dimension minus one. 


DEFINITION. A closed set $F$ is said to be of dimension $r$ if there exists an
$\epsilon>0$, such that $F$ may be decomposed into the sum of a countable infinity
of closed sets $F_1, F_2, \dots, F_i, \dots$ of diameter less than $\epsilon$ for which
$\dim F_i\cdot F_j\le r-1$, $i\ne j$, but for any such decomposition there exists a pair of
integers $m$ and $n$ such that $\dim F_m\cdot F_n=r-1$.


It is our aim in the sequel to show the complete equivalence between this
definition of dimension and the Menger-Urysohn definition applied to closed
sets. To accomplish this we prove several lemmas and theorems. The first
of these, Lemma $A_r$, is based on the notion of $\epsilon$-modification and simple
$r$-dimensional obstruction introduced by Alexandroff in his article
Dimensionstheorie, previously referred to, and certain methods and theorems proved
there. For the sake of completeness we shall define $\epsilon$-modification and simple
$r$-dimensional obstruction and state the results used in this paper.


DEFINITION. Given a chain $K$ and a positive number $\epsilon$, the chain $K'$ will
be called an $\epsilon$-modification of $K$ if to each simplex $x^h$ of $K$ there corresponds a
chain $y^h$ in $K'$ satisfying the following conditions:

a. If then cy). 

b. If $h=0$ (that is, if $x^h$ is a vertex), then $x^0=y^0$.

c. The sum $|x^h|+|y^h|$ is contained in a sphere of radius $\epsilon$.

d. $K=\sum a_i x_i$ implies $K'=\sum a_i y_i$, where the $y_i$'s are the chains
corresponding to the simplices $x_i$.




Remark. The $\epsilon$-modification will be called simple if the bounding relations
are taken modulo $m$, $m\ne0$, and only integral coefficients appear. But if the $y^h$
are chains with rational coefficients, then this is called an $\epsilon$-modification
modulo 0.

From the definition of $\epsilon$-modification it is a short step to prove that if $K'$
is an $\epsilon$-modification of $K$, then for every simplicial transformation $f$ of $K'$
into $K$, where each vertex of $y^h$ goes into some vertex of $x^h$,

(1)    $f(K')=K.$


DEFINITION. $F\subset R^n$, and $x$ is a point of $F$. $F$ is said to be a simple $r$-dimensional
obstruction in the neighborhood of $x$ if there exists an $\epsilon>0$ so that for
every $\delta$ sufficiently small $S(x,\delta)-F$ contains an $(n-r-1)$-dimensional cycle
modulo 0, which does not bound in $S(x,\epsilon)-F$.


We can now state Alexandroff’s theorem: 


THEOREM. The set F is r-dimensional in the sense of Menger-Urysohn if and 
only if F is a simple r-dimensional obstruction in the neighborhood of at least 
one point, but forms no simple k-dimensional obstruction in the neighborhood of 
any point, if k>r. 


We turn now to the proof of several preliminary lemmas. 


LEMMA $A_r$. Suppose
1. $D$ a domain in $R^n$,
2. $F$ a set closed relative to $D$,
3. the dimension* of $F$ at most equal to $r$,
4. $z^{n-r-2}$ a cycle in $D-F$,
5. $K^{n-r-1}$ a chain bounded by $z^{n-r-2}$ in $D$,
are given. Under these conditions, if $\epsilon_1$ is positive and less than $\rho(K^{n-r-1}, B(D))$,
there exists in $S(K^{n-r-1},\epsilon_1)-F$ a chain $C^{n-r-1}$ bounded by $z^{n-r-2}$ such that
$C^{n-r-1}-K^{n-r-1}\sim0$ in $D$.

Proof. Denote the set $F\cdot S(K^{n-r-1},\epsilon_1)$ by $F'$. $F'$ is a compact subset of $R^n$
of dimension at most $r$ (conditions 2 and 3). From the theorem of Alexandroff
just quoted, $F'$ cannot form an $(r+i)$-dimensional obstruction in the
neighborhood of any point. Hence given an $\epsilon_i>0$, we can find an $\epsilon_{i+1}>0$, $\epsilon_{i+1}<\epsilon_i$,
such that if $z^{n-r-i-1}$ is a cycle in $R^n-F'$, and $\delta(z^{n-r-i-1})<\epsilon_{i+1}$, then there
exists a chain $C^{n-r-i}$ satisfying the conditions:

(a)    $C^{n-r-i}\to z^{n-r-i-1}$ (in $R^n-F'$),
(b)    $\delta(C^{n-r-i})<\tfrac12\epsilon_i$ $(i=1, 2, \dots, n-r-1)$.


* Here, and in the sequel, whenever the term dimension is used it is to be understood in the sense 
of Menger-Urysohn. 




Now given $\epsilon_1$, we find successively the numbers $\epsilon_2, \epsilon_3, \dots, \epsilon_{n-r}$, subject to
these conditions.

Let $|K^{n-r-1}|$ be subdivided into simplices of diameter less than $\epsilon_{n-r}$. We
may assume that none of the vertices of the subdivided complex lie on $F$, for
by an arbitrarily small displacement, leaving $z^{n-r-2}$ intact, the complex can
be made to satisfy this condition. We denote the chain obtained from the
subdivision of $|K^{n-r-1}|$ by $(K^{n-r-1})'$, wherein the orientation of the simplices
will be that induced by their carriers in $K^{n-r-1}$, and their coefficients will be
the same as those of their carriers.

Consider any one-simplex of $(K^{n-r-1})'$, say $x_i^1$. Its boundary is a zero-cycle
of $R^n-F'$ whose diameter is less than $\epsilon_{n-r}$. We can find a chain $y_i^1$ in
$R^n-F'$ bounded by this zero-cycle and such that $\delta(y_i^1)<\tfrac12\epsilon_{n-r-1}$, from the
restrictions on $\epsilon_2, \epsilon_3, \dots, \epsilon_{n-r}$. The vertices of $x_i^1$ are a subset of the vertices of $y_i^1$.

Suppose now that the chains y;* have been constructed and ordered to the 
simplices x; in such a way as to preserve incidence relations. Assume more- 
over that 

5(y#) 

2°. yi c R"—F’, 

3°. every vertex of x)! is a vertex belonging to y,'. 

If x is a simplex of and is its boundary, it follows from 
1° and 3° that 


< 


Also, from 2° and the preservation of incidence under the ordering, >-c‘y; is 
a cycle in R”—F’. Hence from the conditions on the ¢,, (s=1,2,---,n—r), 
and from (a) and (b), there exists a chain 


yitio ciy in R"—F’, 


such that 
5(yit!) < 


Each vertex of x;'+! is a vertex of y,'+! (from 3°). We are careful throughout 
the above process to choose the y;'’s corresponding to simplices of (2"-*-?)’ 
as the simplices themselves. This may be done since (z"-"-*)’ is contained in 
R" —F’. Continuing this process we arrive at an €-modification of the chain 
(K*-’-1)’ which we may denote by 

If 


= > 


= 


then 




and from the construction, we have 


Since Cz-"-! ¢ S(K"~-1, e:), we can replace F’ by F in the preceding relation. 
There exists a chain CR3’~! such that 


(s*-*-*)’ in | gn—r-2 | 


(C%37-! may be obtained by the so-called cylinder construction,t on 2"-"-, 
in which the base is subdivided into isomorphism with (z"~-*)’ and the 
vertical lines are collapsed into points.) 

The chain 


satisfies the statement of the lemma. C*~*-' is bounded by 2"~"-?, and we must 
show that C"-"-!— K*-*-!=0. We do this in two steps: 

(c) — in S(K*~"', a). 

(d) Cay + (Ke) — Ker in | 


Perform a simplicial transformation of CZ-’-! into (K"-*-!)’ in such a way 
that the vertices of y,' are transformed into the vertices of xf ({=0,1,---, 
n—r—1). Then 


= 


follows from the discussion about relation (1). If x is a point of Cg-"-', then 
p(x, f(x))<«. Therefore the straight line xf(x){ lies in S(K"~-", «). Let 
G"~-! be a chain isomorphic to Cz-*-!. Denote the cylinder formed by the 
product of the interval J(0<t<1) and |Gg-’-"| by |G*-*|, and subdivide 
and orient |G"-*| so that 


= x 0 1 Zr x I, 


where $Z^{n-r-2}$ is isomorphic with $(z^{n-r-2})'$. We now construct a continuous
transformation $g$ such that every point $x$ of $G^{n-r-1}\times0$ corresponds to its image $y$
under the isomorphism between $G^{n-r-1}\times0$ and the chain to which $G^{n-r-1}$ is
isomorphic, while $x\times1$ goes into $f(y)$, and $x\times t$, where $0<t<1$, goes into the
point dividing the line $yf(y)$ in the ratio $t:(1-t)$. Then


a1), 


t For a discussion of this method see P. Alexandroff and H. Hopf, Topologie, vol. 1, pp. 196-198. 
t Here xf(x) means the straight line from x to f(x). 




and 


gG-"y = g(G"-") = — (KI)! + X I), 


But $g(Z^{n-r-2}\times I)=0$, since $g$ sends the $(n-r-1)$-chain $Z^{n-r-2}\times I$ into the
$(n-r-2)$-chain $(z^{n-r-2})'$. This establishes relation (c).
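
The transformation $g$ used in this step may be written out explicitly. The formula below is a sketch of the straight-line construction described in the proof, under the assumption that the bound on $\rho(y, f(y))$ is the number $\epsilon_1$ of the lemma; points are treated as vectors of $R^n$.

```latex
% The point dividing the segment from y to f(y) in the ratio t:(1-t):
g(x\times t)=(1-t)\,y+t\,f(y),\qquad 0\le t\le1,
\qquad g(x\times0)=y,\quad g(x\times1)=f(y).
% Its distance from the point f(y) of |K^{n-r-1}|:
\rho\bigl(g(x\times t),\,f(y)\bigr)=(1-t)\,\rho\bigl(y,f(y)\bigr)<\epsilon_1 .
```

Since $f(y)$ is a point of $|K^{n-r-1}|$, the whole image of the cylinder lies in $S(K^{n-r-1},\epsilon_1)$, which is what relation (c) requires.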

To establish (d), it is sufficient to note that the left-hand member is the
boundary of the cylinder on $K^{n-r-1}$ with base subdivided into $(K^{n-r-1})'$ and
vertical lines degenerated into points. Adding (c) and (d), and using the
definition of $C^{n-r-1}$, we have


This completes the proof of the lemma. 


LEMMA B.† Assume
1. $K$ a complex,
2. $F$ a closed subset of $K$,
3. $z^r$ an $r$-cycle on $K$ which does not bound in $K-F$,
4. $K_i^{r+1}$ $(i=1, 2)$ two chains of $K$ bounded by $z^r$,
5. $K^{r+2}$ a chain bounded by the cycle $K_1^{r+1}-K_2^{r+1}$.
Then the intersection $I$ of $F$ and $K^{r+2}$ contains a continuum $M$ joining $K_1^{r+1}$
and $K_2^{r+1}$.

Proof. Suppose $I$ contains no continuum $M$ joining $K_1^{r+1}$ and $K_2^{r+1}$. Then
there exists an $\epsilon>0$, such that whenever $K^{r+2}$ is subdivided into simplices
of diameter less than $\epsilon$, and $H^{r+2}$ is the collection of those simplices having a
non-vacuous intersection with $I$, no component of $H^{r+2}$ joins $K_1^{r+1}$ and
$K_2^{r+1}$. Assume the contrary. There exists a sequence of simplicial subdivisions
of $K^{r+2}$, say $K_1^{r+2}, K_2^{r+2}, \dots, K_i^{r+2}, \dots$, where the simplices of $K_i^{r+2}$ are of
diameter less than $\epsilon_i>0$, $\lim_{i\to\infty}\epsilon_i=0$, and the collection $H_i^{r+2}$, of simplices of
$K_i^{r+2}$ meeting $I$, contains a component joining $K_1^{r+1}$ and $K_2^{r+1}$. Denote these
components by $L_1, L_2, \dots, L_i, \dots$. We can choose a subsequence of the $L_i$
for which the limit inferior‡ is non-vacuous. Since $L_i\subset H_i^{r+2}\subset S(I,\epsilon_i)$, the
limit superior of our subsequence is a subset of $I$. From a well known theorem
of L. Zoretti,§ the limit superior is a continuum $M$. Moreover, since the $L_i$


¢ This lemma was stated and proved first by R. L. Wilder, Generalized closed manifolds, Annals 
of Mathematics, vol. 35 (1934), p. 879, for the case K=R” and F=I"-", a cycle linking 2". Ina 
footnote on the same page Wilder mentions the truth of the lemma when K = R” and F is any closed 
set. This latter statement would have been sufficient for our needs. However the formulation of 
Lemma C has intrinsic interest, and we offer it even though its full generality is not needed here. 

t For a definition of the terms “limit inferior” and “limit superior” see Kuratowski, Topologie I, 
p. 152. 

§ See, e.g., Hausdorff, Mengenlehre, 1927 edition, p. 163. 




join $K_1^{r+1}$ and $K_2^{r+1}$, $M$ does likewise, contrary to our assumption of the
falsity of the lemma.

Now let $\epsilon$ and $H^{r+2}$ be defined as in the preceding paragraph. Let $C^{r+2}$
denote the sum of the components of $H^{r+2}$ which meet $K_1^{r+1}$. We form the
following chain from the simplices of $C^{r+2}$: If $x^{r+2}$ is a simplex of $C^{r+2}$, then
$x^{r+2}$ is assumed to have the same orientation and coefficient as its carrier in
$K^{r+2}$. We denote this chain by $(C^{r+2})'$. Similarly, form the chains $(K^{r+2})'$,
$(K_1^{r+1})'$, $(K_2^{r+1})'$, the subdivisions of the chains $K^{r+2}$, $K_1^{r+1}$, $K_2^{r+1}$. If $L^{r+1}$ is
the boundary of $(C^{r+2})'$, we have

(Cr+2)! 
and 
(Krt*)! — (Ky)! — 

L’* contains points of J only in those simplices which are contained in 
For (Cr+*)’ does not meet (K,.’+")’, and if x*+ is a simplex of 
containing a point of J, then all simplices having x’+! on their boundary be- 
long to (C’+*)’. Hence x’+! can only occur with coefficient different from zero 
when it belongs to either (K,’+")’ or (K2’+")’. Those simplices of L*+! which 
meet I belong to (K,’+')’ and have the same coefficient as they have in 
(K,'*?)’. It follows that is a chain in | K*+?| But 

L130 
and 
(Kir) (ery. 
Hence 
(Ky)! — Lrtt 
in $|K^{r+2}|-I$, and therefore in $K-F$. Since $z^r\sim(z^r)'$ in $|z^r|$, we have $z^r$ does
bound in $K-F$ contrary to the hypothesis of the lemma. This final contradiction
completes the proof.

Let $F$ be a subset of the domain $D\subset R^n$. $F$ is closed relative to $D$ and
decomposed into the sets $F_1, F_2, \dots, F_i, \dots$, each closed relative to $D$.
Denote the totality of products $F_i\cdot F_j$, $i\ne j$, by $P_1, P_2, \dots, P_i, \dots$, and their
sum by $P$. Let $H_i=F_i-P$ $(i=1, 2, \dots)$.

LEMMA C. If infinitely many of the sets $H_i$ are non-vacuous, then infinitely
many of them contain points in the complement of $S$, the limit superior of
$H_1, H_2, \dots, H_i, \dots$.†

† It would be sufficient for our purposes to show that at least two of the sets $H_i$ contain points
in the complement of $S$, but the proof is the same in either case.




Proof. Suppose there were a number s such that S>)>72,H;. Let p; be a 
point of the first non-vacuous set, say H,,. 


F;-p~pi = 0 — 1). 


For if Fy i’<s, then Fy-F,,=P.> pi, and ¢ pr 
-(F,,—P,)=0. Consequently F—(Fi+F.+ --- As FitFs 
+--+ +F,,-: is closed relative to D, there is a neighborhood N, of pf; such 
that Nic D—(Fit+Fot+ - - - +F,,-1). 

Since S>,, N; contains a point p2.¢ H,,, where s2>5s;. In the same way 
we find a neighborhood such that Ne-(Fi+Fe+ - - - =0. 
Continuing this process, we obtain a sequence of integers 5: <s2< - - , s;27%, 
and neighborhoods Ne, ---, Ni, such that 


Ni, 
and 
Wi-(Fi + Fat +Fe-1) = 0. 
Since WV; is compact, []/.,W;+0. If x is a point of [2 ,;, there is an in- 
teger ¢ such that F,1>-x. But x is also contained in N,, while V,-(Fi+Fs 


+--+ +F,,1)=0. This contradiction establishes the lemma. 
5. Principal theorems. Let $F$ be a closed subset of $R^n$, $p$ a point of $F$, and
$\epsilon$ a positive number. $F_1, F_2, \dots, F_i, \dots$ is a decomposition of $F$ into closed
sets.


THEOREM 3. If there exists in $S(p,\epsilon)-F$ a cycle $z^{n-r-1}$, which does not
bound in $S(p,\epsilon)-F$ but does bound in $S(p,\epsilon)-F_i$, $i=1, 2, \dots$, then there is
a pair of integers $m$ and $n$, $m\ne n$, such that

$\dim F_m\cdot F_n\cdot S(p,\epsilon)\ge r-1.$


Proof. If we assume the existence of the cycle $z^{n-r-1}$, $F$ contains a closed
subset $A$ which is irreducible with respect to the property

(2)    $z^{n-r-1}$ non-$\sim0$ in $S(p,\epsilon)-A$.

This may be seen by first showing that when $F\supset F'\supset F''\supset\dots\supset F^{(i)}\supset\dots$
are closed sets and $z^{n-r-1}$ non-$\sim0$ in $S(p,\epsilon)-F^{(i)}$, then $F_\omega=\prod_{i=1}^{\infty}F^{(i)}$ is
likewise closed and $z^{n-r-1}$ non-$\sim0$ in $S(p,\epsilon)-F_\omega$. Then, from a well known
theorem due to Brouwer,† the existence of $A$ follows. To prove the first point
it is sufficient to remark that if $K^{n-r}$ is any chain in $S(p,\epsilon)$ bounded by $z^{n-r-1}$,
then $K^{n-r}$ has a compact and non-vacuous intersection with each of the sets
$F^{(i)}$. The product of these intersections is non-vacuous and belongs to $F_\omega$.


† See K. Menger, Dimensionstheorie, Leipzig and Berlin, 1928, p. 69.




Let $A_i=F_i\cdot A$. $A_i$ is closed, and

(3)    $A=\sum_{i=1}^{\infty}A_i.$

From $F_i\supset A_i$ follows $S(p,\epsilon)-A_i\supset S(p,\epsilon)-F_i$. By hypothesis, $z^{n-r-1}$ bounds
in $S(p,\epsilon)-F_i$. Hence

(4)    $z^{n-r-1}\sim0$ in $S(p,\epsilon)-A_i$.

Also $A_i\cdot A_j\cdot S(p,\epsilon)\subset F_i\cdot F_j\cdot S(p,\epsilon)$, which gives

$\dim A_i\cdot A_j\cdot S(p,\epsilon)\le\dim F_i\cdot F_j\cdot S(p,\epsilon).$

From this point to the completion of the proof, we shall suppose the theorem
false, that is, $\dim F_i\cdot F_j\cdot S(p,\epsilon)\le r-2$, for all $i$, $j$ $(i\ne j)$. Consequently

$\dim A_i\cdot A_j\cdot S(p,\epsilon)\le r-2$, for $i\ne j$.

Arranging the countable collection of products $A_i\cdot A_j\cdot S(p,\epsilon)$ in a sequence,
and renaming them $P_1, P_2, \dots$, we have $P=\sum_{i=1}^{\infty}P_i$ is of dimension at most
$r-2$. This is a consequence of the well known result that the sum of a countable
number of sets, closed relative to a domain and each of dimension at
most $r-2$, is itself of dimension at most $r-2$. We have thus constructed a
set $A$ and a subdivision $A_1, A_2, \dots$ which bear the same relation to $z^{n-r-1}$
and $S(p,\epsilon)$ as do $F$ and its subdivision and in addition is irreducible with
respect to the non-bounding of $z^{n-r-1}$ in $S(p,\epsilon)-A$.
We now prove that if

$A(t)=\sum_{i=1}^{t}A_i,$

then

(5)    $z^{n-r-1}\sim0$ in $S(p,\epsilon)-A(t)$.

This is demonstrated for the case $t=2$. A simple induction then carries the
demonstration to any finite $t$. For, if by assumption, $z^{n-r-1}$ bounds in
$S(p,\epsilon)-A(t-1)$, then $A(t-1)$ and $A_t$ satisfy the same conditions as do $A_1$
and $A_2$, and $A(t-1)+A_t=A(t)$. ($A(t-1)$ is closed since it is the sum of a
finite number of closed sets. Moreover, $A(t-1)\cdot A_t\cdot S(p,\epsilon)=A_1\cdot A_t\cdot S(p,\epsilon)
+A_2\cdot A_t\cdot S(p,\epsilon)+\dots+A_{t-1}\cdot A_t\cdot S(p,\epsilon)$, being the sum of a finite number
of sets closed relative to $S(p,\epsilon)$, and each of dimension at most $r-2$, satisfies
the relation $\dim A(t-1)\cdot A_t\cdot S(p,\epsilon)\le r-2$.) We turn to the proof for $t=2$.

We can find chains $C_j^{n-r}$ in $S(p,\epsilon)-A_j$, $j=1, 2$ (relation (4)), such that

$C_j^{n-r}\to z^{n-r-1}$ in $S(p,\epsilon)-A_j$, $j=1, 2$.




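Denote the cycle $C_1^{n-r}-C_2^{n-r}$ by $Z^{n-r}$. Let $K^{n-r+1}$ be a chain bounded by
$Z^{n-r}$ in $S(p,\epsilon)$. $A_1\cdot A_2\cdot S(p,\epsilon)$, $Z^{n-r}$, $K^{n-r+1}$, $S(p,\epsilon)$ satisfy the same
conditions as do $F$, $z^{n-r}$, $K^{n-r+1}$, $D$, respectively, in Lemma $A_{r-2}$. Hence there
exists a chain $C^{n-r+1}$ such that

$C^{n-r+1}\to Z^{n-r}$ in $S(p,\epsilon)-A_1\cdot A_2\cdot S(p,\epsilon)$.

Now if $z^{n-r-1}$ did not bound in $S(p,\epsilon)-A(2)$, then $S(p,\epsilon)$, $A(2)$, $z^{n-r-1}$, $C_j^{n-r}$,
$C^{n-r+1}$ would satisfy the same conditions as $K$, $F$, $z^r$, $K_i^{r+1}$, $K^{r+2}$, respectively,
in Lemma B. Hence $A(2)\cdot|C^{n-r+1}|$ would contain a continuum $M$ joining
$C_1^{n-r}$ and $C_2^{n-r}$. This is impossible.

$M=M\cdot A_1+M\cdot A_2.$

Neither of these sets is vacuous, since $A_2\supset M\cdot|C_1^{n-r}|\ne0$ and $A_1\supset M\cdot|C_2^{n-r}|
\ne0$. The intersection $(M\cdot A_1)\cdot(M\cdot A_2)=M\cdot(A_1\cdot A_2)=0$.
This negates the connectedness of $M$ and therefore establishes relation (5).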

Infinitely many of the sets $S(p,\epsilon)\cdot A_i-P$ are non-vacuous. Suppose to
the contrary that only

$S(p,\epsilon)\cdot A_1-P,\ S(p,\epsilon)\cdot A_2-P,\ \dots,\ S(p,\epsilon)\cdot A_t-P$

are non-vacuous. Then

$S(p,\epsilon)\cdot A=S(p,\epsilon)\cdot A(t)+P.$

But there exists a chain $C^{n-r}$ (relation (5)) such that

$C^{n-r}\to z^{n-r-1}$ in $S(p,\epsilon)-A(t)$.

We then find a neighborhood $D$ of $C^{n-r}$ which is small enough to lie in $S(p,\epsilon)$
and to exclude $A(t)$. Since $A\cdot D=P\cdot D$, it would follow that $\dim A\cdot D\le r-2$.
Again applying Lemma $A_{r-2}$, $z^{n-r-1}$ would bound in $D-A\cdot D$ and consequently
in $S(p,\epsilon)-A$, contradicting relation (2).

Now $S(p,\epsilon)\cdot A=\sum_{i=1}^{\infty}S(p,\epsilon)\cdot A_i$, and infinitely many of the sets
$S(p,\epsilon)\cdot A_i-P$ are non-vacuous. Hence, by Lemma C, there exist two points
$p_1$ and $p_2$ belonging to different sets, say, $A_{i_1}-P$ and $A_{i_2}-P$, and such
that neither $p_1$ nor $p_2$ belongs to the limit superior of $S(p,\epsilon)\cdot A_1-P$,
$S(p,\epsilon)\cdot A_2-P, \dots$. We can find a number $\alpha$ sufficiently small so that
$S(p_k,\alpha)$ will meet no set of $S(p,\epsilon)\cdot A_1-P$, $S(p,\epsilon)\cdot A_2-P, \dots$ other than
$A_{i_k}-P$ $(k=1, 2)$. The sets $A-S(p_k,\alpha)$ $(k=1, 2)$ are closed and proper subsets
of $A$. It follows from the irreducibility of $A$ that there exist chains $C_k^{n-r}$
such that

$C_k^{n-r}\to z^{n-r-1}$ in $S(p,\epsilon)-\{A-S(p_k,\alpha)\}$.


Choose a number ¢« such that e is smaller than either of the numbers 




A—S(px, We now replace the chains C,"~" by chains 
lying in S(C,"-*, 4) — Pi, which are bounded by z"~*-'. This is of course pos- 
sible by Lemma A;-1. (P; is closed relative to S(C,"-", «) and of dimension 
less than r—1.) The cycle Ci;’ —C7;” lies in the complement of P; in S(, €) 
and hence bounds in S(p, ¢) —P:. This may be shown by allowing Ci7’ —Ciy’ 
to bound some chain in S(p, €), and then displacing this chain to another 
chain C,"-"+! bounded by Ci;’—Ciy’ in S(p, €) —P:1; a permissible operation 
as shown by Lemma A,-_». 
Let us suppose that we have constructed the following chains: 


Assume further that these chains satisfy the following conditions: 


(a) > ax > 0. 


(b) ax = P) > for s> i. 


(d) Cie in S(p,€) — {A — S(pi, a) }, =1, 2. 
di = B(S(p, €))) > d > 0. 


We proceed with the construction of C?7/*?. 
Choose a number 6 smaller than the minimum of the numbers 


an — i= i. 
b®. S(pr, a)), k=1,2, 
dy d. 


If is any chain bounded by in 6), then satisfies 
condition (d). To prove this we show that if x is a point of A —S(p;, a) and y 
a point of C77{,, then p(x, y) >0. Cf, is assumed to lie in S(C?z’, 5), so that 
there is a point z of C7,’ such that p(y, z) <6. Condition b® applied to 


ield p(x, y) = p(x, z) p(y, 
yields 


p(x, = , A — S(pe,a)) — 6 > 0. 
If Ci7{*? is a chain in S(C,"-"+', 6), C?77*? will satisfy (b), that is, 


For if x is a point of C?7/*', y a point of P;, there is a point z of C,"-"+! such 
that p(x, z) <6. From p(x, y) 2p(z, y) —p(x, z) and condition a® on the num- 


ber 5, we have 
n—r+1 


p(x, y) 2 


» Pi) > an — (an — = 




Similar considerations show that C?7/*' would satisfy (e). 
In 6) — (k=1, 2), we can find a chain C?7{,, and in S(Cfy’, 6) 
a chain *C*~*-! such that 


in Six 6) — Pus, 
7 n—r n—T 
(This is justified by Lemma A,_2.) The chain 


n—r+1 n—r+l1 i 2 n—r+1 


(8) Cui =Cy +C 


is such that 


Cita — in 8) 
by (7) and (c) applied to (8). Since the boundary of *C?7/*! lies in 
6) — by Lemma A,_» we can replace *C?7{*' by C?7{*', a chain 
in S(C,"-"+1, 5) — bounded by 
The set is closed relative to S(p, €), and C?7{*' is contained in €) 
and does not meet P,,,. Consequently we can find a number az41,41 for which 
the relation 


n—r+l1 


Peri) > > O 


holds. We have thus obtained an extension of the system (6) by the addition 
of C?z{**. Since we had previously shown that (6) existed for ¢=1, this lat- 
ter shows that (6) can be extended indefinitely. We may therefore suppose 
that we have constructed a countable infinity 

of chains satisfying relations (a)—(e) inclusive. 

Consider any chain of the above sequence. A-|C,"-r+1| is a closed set, 
and z"-’-! does not, of course, bound on | C,"-"+!| —A-|C*-r+!|. Applying 
Lemma B, we see that A-|C,"-"+"| contains a continuum M, joining C?;’ 
and C?}’. Since these latter chains meet A only in S(p2, a) and S(p1, a) respec- 
tively, we have 

From $|C_t^{n-r+1}|\subset S(p,\epsilon-d)$ (condition (e)), it follows that

$M_t\subset S(p,\epsilon-d)$,

that is, the sequence

(9) $M_1, M_2, \cdots, M_t, \cdots$


is uniformly bounded, and the limit superior of (9) is contained in the interior
of $S(p,\epsilon)$. From (9) we choose a subsequence

(10) $M_{t_1}, M_{t_2}, M_{t_3}, \cdots$

of which the limit inferior is non-vacuous. From the theorem of L. Zoretti
previously referred to, the limit superior of (10) is a continuum $M$. Since
each $M_t$ is contained in $S(p,\epsilon-d)$, $M$ is likewise. From $A\supset M_t$, and $A$ closed,
we have

(11) $A\supset M$.

We affirm

(12) $M\cdot P=0$.


If we assumed the contrary, there would be some $P_i$ for which $M\cdot P_i\ne 0$. But
$M_s\subset|C_s^{n-r+1}|$, and from condition (b) on (6') we should have


Pi) > fai >O0, forall s>i. 
This leads to the conclusions 
Pi) 
and therefore $M\cdot P_i=0$, which proves (12). Combining (3) and (11), we have

$M=\sum_{i=1}^{\infty}M\cdot A_i.$

Moreover

$(M\cdot A_i)\cdot(M\cdot A_j)=0,\quad i\ne j;$

$M\cdot A_i$ is closed.

We have thus obtained a decomposition of the continuum $M$ into a countable
infinity of closed sets whose intersections taken pairwise are vacuous.
This is impossible according to Sierpiński's theorem unless all but one of
these sets are vacuous. But $M$ must contain a point of $A\cdot S(p_k,\alpha)$ $(k=1,2)$,
since each of the sets in (9) does. From the choice of the number $\alpha$, we have
$A_{i_k}-P\supset A\cdot S(p_k,\alpha)-P$. $M$ must therefore have a non-vacuous intersection
with both $A_{i_1}$ and $A_{i_2}$. This final contradiction completes the proof of the
theorem.


COROLLARY. In $R^n$ let $F'$ be a closed set and $z^{n-r-1}$ a cycle in $R^n-F'$ which
does not bound in $R^n-F'$. Moreover let $F'=F_1'+F_2'+\cdots+F_s'+\cdots$ where
$F_s'$ is closed $(s=1,2,\cdots)$ and $z^{n-r-1}$ bounds in $R^n-F_s'$. Then there exists a
pair of integers $m$ and $n$ such that $\dim F_m'\cdot F_n'\ge r-1$.




To show this we let $p$ be any point of $F'$ and $\epsilon$ some positive number.
$R^n$ is homeomorphic to $S(p,\epsilon)$, and the homeomorphism $f$ may be so taken
as to leave $z^{n-r-1}$ invariant. If we take $F=f(F')$ and $F_s=f(F_s')$,
then, since

$F_i\cdot F_j\cdot S(p,\epsilon)=f(F_i')\cdot f(F_j')$,

we have from Theorem 3,

$\dim f(F_m')\cdot f(F_n')\ge r-1$

for some pair of integers $m$ and $n$. But because of the invariance of dimension
under homeomorphism, this implies

$\dim F_m'\cdot F_n'\ge r-1$.


Another very interesting application of Theorem 3, or rather the corollary
to Theorem 3, is the generalization of a theorem first proved by Miss Mullikin.†
Miss Mullikin showed that the sum of a countable number of closed sets,
no one of which separates the plane, and whose intersections taken pairwise
are vacuous, cannot separate the plane. Recalling that the vacuous set is of
dimension minus one, and the plane is $R^2$, we see that this is a particular case
of the following theorem:


THEOREM 4. In $R^n$ let $A_1', A_2', \cdots$ be a countable collection of closed sets
no one of which separates $R^n$, and such that

$\dim A_i'\cdot A_j'\le n-3$, for $i\ne j$.

Under these conditions the sum $S=\sum_{i=1}^{\infty}A_i'$ cannot separate $R^n$.
Proof. Let us suppose, to the contrary, that $S$ does separate $R^n$. Then

$R^n-S=M_1+M_2$,

where $M_1$ and $M_2$ are non-vacuous and mutually separated. From the well
known result that if a set $X$ separates a connected set $M$ then some closed
subset $Y$ of $X$ also separates $M$,‡ it follows that $S$ contains a closed subset $A$
which separates $M_1$ from $M_2$ in $R^n$. Setting $A_i=A\cdot A_i'$, we have $A=\sum_{i=1}^{\infty}A_i$
($A_i$ closed). Taking $a_1$ a point of $M_1$ and $a_2$ a point of $M_2$, we have $a_1-a_2\sim 0$
in $R^n-A_i$ but $a_1-a_2\not\sim 0$ in $R^n-A$. It follows from the corollary to Theorem 3
that there exists a pair of integers $m$ and $n$ such that $\dim A_m\cdot A_n\ge n-2$.
Since $A_i'\supset A_i$, $\dim A_m'\cdot A_n'\ge n-2$, contrary to the hypothesis of the theorem.
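
The dimension count behind this appeal to the corollary can be written out as follows. This is only a sketch of the bookkeeping, in the notation of the corollary, with the 0-cycle $a_1-a_2$ playing the role of $z^{n-r-1}$; the index $n'$ and the symbol $\emptyset$ for the vacuous set are our own labels, introduced to avoid a clash with the dimension $n$.

```latex
% Sketch: matching Theorem 4 to the corollary of Theorem 3.
% The cycle a_1 - a_2 is 0-dimensional, so
\[
  n - r - 1 = 0 \quad\Longrightarrow\quad r = n-1, \qquad r - 1 = n-2 .
\]
% The corollary then yields a pair of indices m, n' (n' is our label) with
\[
  \dim A_m \cdot A_{n'} \ge n-2 .
\]
% But A_i \subset A_i', and dimension does not increase on passing to a subset, so
\[
  \dim A_m \cdot A_{n'} \le \dim A_m' \cdot A_{n'}' \le n-3 ,
\]
% which is the contradiction. Mullikin's case is n = 2, where the hypothesis reads
\[
  \dim A_i' \cdot A_j' \le -1 \iff A_i' \cdot A_j' = \emptyset \qquad (i \ne j).
\]
```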

We return now to our discussion of the dimension of closed sets. If the 


† A. M. Mullikin, Certain theorems relating to plane connected point sets, these Transactions,
vol. 24 (1922), pp. 144-162.
‡ See Knaster and Kuratowski, Fundamenta Mathematicae, vol. 2 (1921), pp. 234-235.




dimension of the closed set $F$ is $r$, then $F$ is a simple $r$-dimensional obstruction
at a point $p$ of $F$. We can find an $\epsilon>0$ and a cycle $z^{n-r-1}$ in $S(p,\epsilon)-F$ such
that $z^{n-r-1}$ does not bound in $S(p,\epsilon)-F$. Denote by $d$ the distance between
$F$ and $z^{n-r-1}$. Now let $F$ be decomposed into the sum of a countable number
of closed sets $F_1, F_2, \cdots, F_s, \cdots$ each of which is of diameter smaller
than $d$. These sets will satisfy the hypotheses of Theorem 3, if $z^{n-r-1}\sim 0$ in
$S(p,\epsilon)-F_s$.

If $F_s\cdot S(p,\epsilon)=0$, the above is certainly true. Let us assume therefore that
$q_s$ is a point of $F_s\cdot S(p,\epsilon)$. $\delta_s=\delta(F_s)<d$. Choose two numbers $\alpha$ and $\beta$ such
that $\delta_s<\alpha<\beta<d$. $S(q_s,\alpha)\supset F_s$ and $S(p,\epsilon)-S(q_s,\alpha)\supset z^{n-r-1}$. We may assume
that $r\ne 0$, since the theorem we are about to prove is trivial for the case $r=0$.
We now show that $z^{n-r-1}$ can be deformed into a point in $S(p,\epsilon)-S(q_s,\alpha)$,
from which $z^{n-r-1}\sim 0$ in $S(p,\epsilon)-F_s$ follows. This may be done in two steps.
First project $z^{n-r-1}$ on $S(p,\epsilon)\cdot B(S(q_s,\beta))$, with center of projection $q_s$.
Denote the projection of $z^{n-r-1}$ by $z_1^{n-r-1}$. Since $S(p,\epsilon)\cdot B(S(q_s,\beta))$ is either equal
to $B(S(q_s,\beta))$ or homeomorphic to a hemisphere of $B(S(q_s,\beta))$, our second
step, deforming $z_1^{n-r-1}$ into a point, is possible (note $n-r-1\ne n-1$). The
projection can be considered a deformation with the parameter varying from
0 to $\tfrac12$ as $z^{n-r-1}$ moves to $z_1^{n-r-1}$, and from $\tfrac12$ to 1 in the second step.
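
The first step of this deformation can be made explicit. The formula below is a sketch under the assumption that the metric is the Euclidean one, so that $S(p,\epsilon)$ is convex; the map $h$ and the norm notation are ours and do not appear in the paper.

```latex
% Radial projection from q_s onto the sphere B(S(q_s, \beta)),
% run over the parameter interval 0 <= t <= 1/2.
\[
  h(x,t) \;=\; q_s + \bigl[(1-2t)\,\lVert x-q_s\rVert + 2t\,\beta\bigr]\,
               \frac{x-q_s}{\lVert x-q_s\rVert},
  \qquad x \in |z^{n-r-1}|,\quad 0 \le t \le \tfrac12 .
\]
% The cycle is at distance d > \beta from F and q_s lies in F,
% so \lVert x-q_s\rVert > \beta, and therefore
\[
  \beta \;\le\; \lVert h(x,t)-q_s\rVert \;\le\; \lVert x-q_s\rVert .
\]
% Thus h(x,t) stays outside S(q_s,\alpha) (because \alpha < \beta), and it lies on
% the segment from x to q_s, which is contained in the convex set S(p,\epsilon).
\[
  h(x,0)=x, \qquad \lVert h(x,\tfrac12)-q_s\rVert=\beta .
\]
```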

We can therefore say that if a set $F$ is of dimension $r$, there exists a number
$d$ such that any decomposition of $F$ into a countable infinity of closed sets
of diameter less than $d$ has the property that the intersection of at least one
pair is of dimension at least $r-1$. Conversely, if a closed set $F$ in $R^n$ can be
$\epsilon$-decomposed into a countable infinity of closed sets with the intersection of
any pair of dimension at most $r-1$, and if $\epsilon$ is arbitrary, then the dimension
of $F$ is at most $r$. From the fact that any closed set of dimension $r$ can be
decomposed into a countable infinity of closed sets of diameter less than any
preassigned $\epsilon$, whose intersections, taken pairwise, are of dimension at most
$r-1$ (if the set is compact, these may even be taken to be finite in number),
we can state the following theorem:


THEOREM 5. A closed subset $F$ of $R^n$ is of dimension $r$ if and only if there
exists an $\epsilon>0$, such that $F$ may be decomposed into the sum of a countable infinity
of closed sets $F_1, F_2, \cdots, F_s, \cdots$, of diameter less than $\epsilon$ with $\dim F_i\cdot F_j\le r-1$,
$i\ne j$, but for any such decomposition there exists a pair of integers $m$ and $n$ such
that $\dim F_m\cdot F_n=r-1$.


UNIVERSITY OF PENNSYLVANIA, 
PHILADELPHIA, Pa. 


CONCERNING LIMITING SETS IN ABSTRACT 
SPACES, II* 


BY 
R. G. LUBBEN 


In his first paper on limiting sets† the author considered the distributive
property in connection with metric spaces. In this paper we consider the
property in connection with more general spaces and show that it and weak
additional hypotheses imply that every uncountable point set in the space
under consideration (1) is a-compact in itself‡ and (2) is separable. It is well
known that in a metric space properties (1), (2), and the following, (3), are
equivalent: (3) Every point set has the Lindelöf property. Sierpiński has
shown that (2) and (3) are independent in a space S.§ In consideration of
Sierpiński's result an equivalence involving these properties as stated in
Theorem 7 is of considerable interest and is used in showing that (2) holds
in Hausdorff space.

Above we discussed certain properties that hold “im grossen.” With the 
help of the first countability axiom, or a more general hypothesis concerning 


* Presented to the Society, December 27, 1928, August 30, 1929, and September 9, 1931; re- 
ceived by the editors September 8, 1936 and, in revised form, May 14, 1937. 

† These Transactions, vol. 30 (1928), pp. 668-685. In a topological space the limiting set of an
aggregate G of sets is the set of all points P of the space such that every neighborhood of P contains
points in common with infinitely many distinct elements of G. The elements of G are understood to
be sets $h(\alpha, g)$, where $\alpha$ is a number and g is a point set in the space; for the case $\alpha=0$ let $h(\alpha, g)$ be g,
and for other values of $\alpha$ let the elements of $h(\alpha, g)$ be $\alpha$ and the points of g. Thus, we may refer to
an element $h(\alpha, g)$ of G as a point set if $\alpha=0$. A topological space is said to have the distributive
property provided that if in that space K is a closed point set, G is a collection of sets, and if each point
of K belongs to some subset of K which is the limiting set of a sub-collection of G, then K itself is the
limiting set of a sub-collection of G.

For well known definitions and for general information about topological spaces see: M. Fréchet,
Les Espaces Abstraits et leur Théorie Considérée comme Introduction à l'Analyse Générale, 1928;
F. Hausdorff, (I) Grundzüge der Mengenlehre, 1914, and (II) Mengenlehre, 1927; K. Menger, (I)
Dimensionstheorie, 1928, and (II) Kurventheorie, 1932; R. L. Moore, Foundations of Point Set Theory,
American Mathematical Society Colloquium Publications, vol. 13, 1932; C. Kuratowski, Topologie,
1933; W. Sierpiński, (I) Introduction to General Topology, 1934, translated by C. C. Krieger; and
P. Alexandroff and H. Hopf, Topologie, 1935.

‡ A space or a point set is a-compact (in itself) provided that every uncountable point set in it
has a limit point (contains a limit point of itself). Cf. W. Gross, Zur Theorie der Mengen in denen ein
Distanzbegriff definiert ist, Sitzungsberichte der Kaiserlichen Akademie der Wissenschaften, part IIa,
vol. 123 (1914), p. 805.

§ Cf. W. Sierpiński, (II) Sur l'équivalence de trois propriétés des ensembles abstraits, Fundamenta
Mathematicae, vol. 2 (1921), pp. 179-188; C. Kuratowski and W. Sierpiński, Le théorème de
Borel-Lebesgue dans la théorie des ensembles abstraits, Fundamenta Mathematicae, vol. 2 (1921), pp. 172-178.




monotonic families of neighborhoods* the distributive property gives also
local compactness and regularity. The author has been unable to prove that
the Lindelöf property is among these necessary conditions; if it is, it is
possible to state simple necessary and sufficient conditions for the distributive
property (see Theorems 14 and 16).


Lemma I. In a Fréchet space H in order that every point set be separable,
it is necessary and sufficient that (1) every closed set be separable and (2) if a 
point is a limit point of a point set, it is a limit point of a countable subset of the 
point set. 


Lemma II. In a Fréchet space V every monotonic family of neighborhoods of a
point contains a sub-collection which is a well-ordered monotonic descending 
family of neighborhoods of the point. 


THEOREM 1. A space satisfies the first countability axiom if each point in it 
has a monotonic family of neighborhoods and one of the following holds: (A) The 
space is a Hausdorff space in which every point set is a-compact in itself; (B) the 
space is a space H in which a point is a limit point of a point set if and only 
if it is a limit point of a countable subset of the point set. 


Proof. Consider first case (A). Let $S$ be the set of all points in the space,
$P$ be a point in it, $H$ be a well-ordered monotonic descending family of
neighborhoods of $P$, and $K$ be a well-ordering of the points of $S-P$. Let $U_1$ be an
element of $H$, $P_1$ be the first point of $K$ in $U_1$, and $V_1$ the first element in $H$
which is a subset of $U_1$ and of which $P_1$ is not a point or a boundary point.
Suppose that $P$ is not an isolated point of the space. Suppose that $U_x$, $P_x$,
and $V_x$ have been defined for each ordinal $x$ less than a definite ordinal $\alpha$.
Provided that there exist elements of $H$ common to all $V_x$'s for $x<\alpha$ we shall
define $U_\alpha$, $P_\alpha$, and $V_\alpha$ as follows: $U_\alpha$ is the first element of $H$ common to all
the $V_x$'s for $x<\alpha$; $P_\alpha$ is the first element of $K$ in $U_\alpha$; $V_\alpha$ is the first element of
$H$ which is a subset of $U_\alpha$ and of which $P_\alpha$ is not a point or limit point. Let $G$
be the well-ordered sequence $(V_1, V_2, V_3, \cdots, V_\omega, \cdots, V_\alpha, \cdots)$, where $\alpha$ is
such that $U_\alpha$, $P_\alpha$, and $V_\alpha$ exist; and let $E=(P_1, P_2, P_3, \cdots, P_\omega, \cdots, P_\alpha, \cdots)$.
Clearly each point of $E$ is an isolated point of $E$. It follows from our condition
that $E$, and hence $G$, each has a finite or a countable number of elements.

We shall show that each element of H contains an element of G. Suppose 
this were not true; then, since H is monotonic, it would contain an element U 

* A complete family of neighborhoods of a point is one that defines the operation of derivation at 
that point; cf. Fréchet, pp. 172-173. Such a family is monotonic provided that every pair of its ele- 
ments has the property that one is a subset of the other, and said to be monotonic descending with ref- 
erence to a definite ordering provided that if one element precedes another, the first contains the 


second. In connection with a space H Fréchet we shall consider the term neighborhood as equivalent 
to the term “open set”; cf. Fréchet, pp. 186-187. 




which is a subset of all elements of $G$. There exists a first ordinal $\lambda$ which is
greater than all ordinals $x$ such that there exists an element $V_x$ of $G$. Since $H$
is a well-ordered sequence, there exists a first one of its elements that is
common to all elements of $G$, and this element is by definition $U_\lambda$. Then the first
point of $K$ in $U_\lambda$ is $P_\lambda$. There exist open sets $R_1$ and $R_2$ containing $P_\lambda$ and $P$
respectively and having no common points. Then $P$ is not a point or limit
point of $S-U_\lambda$ or of $S-R_2$; hence, there exists a first element of $H$ that is a
subset of $U_\lambda$ and does not have $P_\lambda$ on its boundary. This element is by
definition $V_\lambda$. But this is contrary to the definition of $\lambda$. Thus, every element of $H$
contains an element of $G$; the converse is true. Since $H$ is a family of
neighborhoods of $P$, so is $G$.†

Consider next case (B). Let $P$ be a limit point of the set of distinct points,
$E=(P_1, P_2, P_3, \cdots)$, none of which is $P$, and let $H=[W]$ be a family of
neighborhoods of $P$. For each $n$ let $W_n$ be an element of $H$ containing no point
of $P_1+P_2+P_3+\cdots+P_n$, and $G=(W_1, W_2, W_3, \cdots)$. Then each element of
$H$ contains an element of $G$; for, if some element $U$ of $H$ did not contain an
element of $G$, it would be a subset of every element of $G$. Hence, if $n$ is any
integer, $U$ is a subset of $W_n$, and does not contain $P_n$. But this involves a
contradiction, since $P$ is a limit point of $E$. Thus every element of $H$ contains an
element of $G$, and conversely. It follows that $G$ is a family of neighborhoods
of $P$.†

THEOREM 2. A locally compact Hausdorff space which has the Lindelöf property
satisfies the first countability axiom.

Proof. Let $P$ be a point of our space $T$. For each point $Q$ of $T-P$ let $U_Q$
and $V_Q$ be mutually exclusive open sets containing $Q$ and $P$ respectively. Then
$T-P$ may be covered by a countable sequence $(U_{Q_1}, U_{Q_2}, U_{Q_3}, \cdots)$ of the
elements of $[U_Q]$. Let $R$ be an open set containing $P$ such that $\overline{R}$ is compact;
let $W_n=R\cdot V_{Q_1}\cdot V_{Q_2}\cdots V_{Q_n}$; and let $F=(W_1, W_2, W_3, \cdots)$. Then $F$ is a
monotonic descending family of neighborhoods of $P$.

For, let $M$ be a point set having points distinct from $P$ in every element
of $F$. It may be shown that $M$ has a subset $N=(P_1, P_2, P_3, \cdots)$ of distinct
points such that $P_n\in W_n$ for each $n$. Since $\overline{R}\supset\overline{W}_n$, $\overline{W}_n$ is compact; hence $N$ has
a limit point $X$ which is a point of $\overline{W}_1\cdot\overline{W}_2\cdot\overline{W}_3\cdots$. If $X$ were a point of
$T-P$, there would be an integer $n$ such that $X$ belongs to $U_{Q_n}$. Since $\overline{W}_n$
contains no point of $U_{Q_n}$, we are involved in a contradiction.

Conversely if P is a limit point of a point set K, every element of F con- 
tains a point of K distinct from P. 
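
Two facts about the sets $W_n$ carry this proof. They are spelled out below as a sketch, on the assumption that $W_n$ is the intersection of $R$ with the first $n$ of the sets $V_{Q_k}$; the garbled definition in the source appears to say this, but the formula is a reconstruction and not a quotation.

```latex
% Assumed definition (a reconstruction):
\[
  W_n = R \cdot V_{Q_1} \cdot V_{Q_2} \cdots V_{Q_n}, \qquad n = 1, 2, 3, \dots
\]
% (i) The family is monotonic descending:
\[
  W_{n+1} = W_n \cdot V_{Q_{n+1}} \subset W_n .
\]
% (ii) U_{Q_n} is open and has no point in common with V_{Q_n} \supset W_n,
%      so it contains no point and no limit point of W_n:
\[
  U_{Q_n} \cdot \overline{W_n} = 0 .
\]
% Hence a point X common to all the closures \overline{W_n} lies in no U_{Q_n};
% since the sets U_{Q_n} cover T - P, this forces X = P.
```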


Lemma III. In a space H the limiting set of a collection of point sets is closed. 


† Cf. Fréchet, p. 173.




THEOREM 3. Every regular space H which satisfies the first countability axiom 
and has the distributive property is locally compact. 


This theorem may be proved by methods analogous to those used in the 
proof of Theorem 8 of the author’s first paper, p. 677. 


THEOREM 4. Every space V which has the distributive property is a-compact. 


Proof. Suppose there exists a space which satisfies the hypothesis, but
contains an uncountable point set $M$ whose derived set is vacuous. Let $K$
be a countable subset of $M$, $N=M-K$, and $P_1, P_2, P_3, \cdots$ be points of $K$.
For each point $x$ of $N$ and each positive integer $n$ let $g_{xn}=x+P_n$. Let $G^*$ be
the aggregate $[g_{xn}]$. For each point $x$ of $N$ there exists a neighborhood $R_x$
of $x$ which contains no point of $M-x$. Hence $x$ is the limiting set of the
aggregate $G_x^*=(g_{x1}, g_{x2}, g_{x3}, \cdots)$. Since $N$ has no limit point, it is closed. It
follows from our hypothesis that $G^*$ contains a sub-collection $G$ whose limiting
set is $N$. Let $G_x=G\cdot G_x^*$. Since $R_x$ contains no point of any element of $G-G_x$,
$G_x$ contains infinitely many distinct elements. Since $N$ is an uncountable point
set, and an element of $G_y$ contains the point $z$ of $N$ only if $y=z$, $G$ has
uncountably many elements. Hence there exists an integer $n$ such that $P_n$ is
common to infinitely many elements of $G$. This involves a contradiction with
the fact that $N$ is the limiting set of $G$.
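
The counting step at the end of this proof is a pigeonhole argument. A sketch follows, in which the sub-collections $G^{(n)}$ are our own notation, and each element of $G$ is assumed to consist of one point $x$ of $N$ together with one of the points $P_n$, as in the proof.

```latex
% Every element of G contains exactly one of the points P_1, P_2, ..., so
\[
  G = \sum_{n=1}^{\infty} G^{(n)}, \qquad
  G^{(n)} = \{\, g \in G : P_n \in g \,\}.
\]
% G is uncountable, while a countable sum of finite collections is countable; hence
\[
  G^{(n)} \text{ is infinite for some } n .
\]
% Every neighborhood of that P_n then has a point in common with infinitely many
% distinct elements of G, so P_n belongs to the limiting set of G.
% But P_n is a point of K and N = M - K, so P_n is not a point of N.
```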


THEOREM 4A. In order that a metric space should have the distributive prop- 
erty, it is necessary and sufficient that it be locally compact and separable. 


This theorem is a consequence of Theorem 4, Gross, loc. cit., pp. 805-806,
and Theorems 8 and 9 of the author's first paper, pp. 677-678.

A space $S_1$ is said to be a sub-space of a space $S_2$ provided that (1) every
point of $S_1$ is a point of $S_2$, and (2) if $P$ is an arbitrary point and $M$ an
arbitrary point set in $S_1$ then $P$ is a limit point of $M$ in $S_1$ if and only if it is a limit
point of $M$ in $S_2$.

Lemma IV. Every subspace of a space S is a space S, and every subspace
of a space H is a space H. 


THEOREM 5. If a space S has the distributive property, then every regular, 
locally compact subspace of it has this property. 


Proof. Let $S_1$ be a space S having the distributive property, and let $T$ be a
regular, locally compact sub-space of it. In $T$ let $K$ be a closed point set and
$G$ be a collection of sets such that each point $P$ of $K$ belongs to a subset $K_P$
of $K$ which is the limiting set of a sub-collection $G_P$ of $G$. Let $L_P$ be the
limiting set of $G_P$ with respect to the space $S_1$. Then $K_P=L_P\cdot M$, where $M$ is the
set of all points belonging to $T$. Let $N$ be the sum of all point sets $L_P$, where



the range of $P$ is $K$; $N'$ be the derived set of $N$ with reference to the space $S_1$;
and $\overline{N}=N+N'$. It follows from the definition of $N$ that $N\cdot M=K$. Suppose
that $N'\cdot M$ contains a point $Q$ which does not belong to $K$. Since $K$ is a closed
point set with respect to $T$, $Q$ is not a limit point of $K$ in either $T$ or $S_1$.
Hence, there exists in $T$ an open set $R_1$ containing $Q$ such that if $\overline{R}_{1T}$ denotes
the sum of $R_1$ and its limit points in $T$, then $K\cdot\overline{R}_{1T}$ is vacuous. Since $T$ is
locally compact, there exists in it an open set $R_2$ containing $Q$ such that $\overline{R}_{2T}$ is
compact. Let $R_3=R_1\cdot R_2$. Then $R_3$ is an open set in $T$ and contains $Q$. Also
$\overline{R}_{1T}\supset\overline{R}_{3T}$. It follows that $Q$ is not a limit point of $M-R_3$ in the space $S_1$.
Hence, there exists in $S_1$ an open set $U$ which contains $Q$ but contains no point
of $M-R_3$. Since $Q$ is a limit point of $N$, there exists in $K$ a point $x$ such that $U$
contains a point $y$ of $L_x$. Then $U$ must contain points of infinitely many
elements of $G_x$. Since the elements of $G_x$ are subsets of $M$, and $U\cdot M$ is a subset
of $R_3$, $R_3$ contains points of infinitely many elements of $G_x$. Since $\overline{R}_{3T}$ is
compact, it must contain a point $W$ which belongs to the limiting set of $G_x$; since
$\overline{R}_{1T}\supset\overline{R}_{3T}$, the latter contains no point of $K$. This involves a contradiction
with the fact that $K\supset K_x$. Hence the point $Q$ does not exist, and $\overline{N}\cdot M=K$.

For each point $x$ of $\overline{N}-K$ let $h_{1x}, h_{2x}, h_{3x}, \cdots$ be the pairs $(x,1)$, $(x,2)$,
$(x,3), \cdots$. Let $H$ be the aggregate $[h_{ix}]$, where the range of $i$ is the set of
positive integers, and that of $x$ is the point set $\overline{N}-K$. Since $S_1$ is a space S,
$\overline{N}$ is closed, and for every point $P$ of $K$ the following holds: $\overline{N}\supset N\supset L_P$. It
follows that each point of $\overline{N}$ belongs to a subset of $\overline{N}$ which is the limiting set
in $S_1$ of a sub-collection of $G+H$. Since $S_1$ has the distributive property,
$G+H$ has a sub-collection $G_1+H_1$ such that $\overline{N}$ is the limiting set of $G_1+H_1$ in
$S_1$ and such that $G_1$ and $H_1$ respectively are sub-collections of $G$ and $H$
respectively.

Suppose that $K$ is not a subset of $K_1$, where $K_1$ is the limiting set of $G_1$
in $S_1$. Then $K$ must contain a point $E$ which belongs to the limiting set of $H_1$.
Let $R_E$ be an open set in $T$ containing $E$ such that $\overline{R}_{ET}$ is compact. It follows
from an analogous situation above that in $S_1$ the point $E$ is not a limit point
of $M-R_E$ and that there exists in $S_1$ an open set $U_E$ which contains $E$ but contains
no point of $M-R_E$. Then $U_E$ contains a point $X$ of an element of $H_1$. Let $B$ denote the
limiting set of $G$ in $S_1$. Since $B\supset N$, and $B$ is closed, $B\supset\overline{N}$. Since $X$ is a point
of $\overline{N}-K=\overline{N}-\overline{N}\cdot M$, and all elements of $G$ are sub-sets of $M$, $X$ is the unique
limit point in $S_1$ of an infinite subset of $M$, and $U_E$, which contains $X$,
contains such a set $E_X$. Then $R_E\supset M\cdot U_E\supset E_X$. Since $X$ is not a point of $T$,
$E_X$ has no limit point in $T$. This involves a contradiction with the fact that
$\overline{R}_{ET}$ is a compact subset of $T$. Hence $K_1\supset K$. Since $M\supset K$, it follows that
$M\cdot K_1\supset K$.

Since $\overline{N}\supset K_1$, $K=\overline{N}\cdot M\supset K_1\cdot M$. From $K\supset K_1\cdot M$ and $M\cdot K_1\supset K$ it follows
that $K=K_1\cdot M$. Hence $K$ is the limiting set of $G_1$ in $T$. Thus we have
shown that $T$ has the distributive property.

A space H is said to be nearly a space L provided that if in that space P
is any point, M is any point set, and P is a limit point of M, then P is the
derived set of a subset of M.


THEOREM 5A. Every regular, locally compact subspace of a space H which
has the distributive property and is nearly a space L has the distributive property. 


The proof is the same as that for Theorem 5. 


THEOREM 6. In a space H which has the distributive property every point 
set is a-compact in itself. 


Proof. Suppose the theorem is not true and that $S_1$ is a space H which
has the distributive property but contains an uncountable point set $M$ which
contains no limit point of itself. Let $T$ be the subspace of $S_1$ whose points
are the points of $M$. To show that $T$ has the distributive property adopt the
notation of the proof of Theorem 5 and follow this proof to the place where
the existence of the collections $G_1$ and $H_1$ is established, and suppose as there
that $K$ contains a point $E$ not belonging to the limiting set of $G_1$. Define $M'$
as the derived set of $M$ in $S_1$. Since the points of elements of $H_1$ are points of
$\overline{N}-K$, it follows that $E$ is a limit point of $M'-M\cdot M'$ in $S_1$. Since $S_1$ is a
space H, derived sets in it are closed, and $E$ is a point of $M'$ and hence a limit
point of $M$. Since $E$ is a limit point of $M$, we are involved in a contradiction.
Thus $E$ does not exist, and the argument of Theorem 5 shows that $T$ has the
distributive property.

By Theorem 4 the set $M$ must have a limit point in $T$. But this again is
contrary to the definition of $M$. Thus the supposition that the theorem is not
true leads to a contradiction.

Note. When the space of our hypothesis is a space S the proof may be
simplified. Let $S_1$ be our space and define $M$ and $T$ as above; then $T$ is a
regular, locally compact subspace of $S_1$, since all its points are isolated. By
Theorem 5, $T$ has the distributive property, and by Theorem 4, we are
involved in a contradiction.

THEOREM 7. In order that in a space H each point set either be condensed in
itself or be separable, it is necessary and sufficient that every point set be
a-compact in itself.†

Proof. Obviously the condition is necessary. Suppose that it is not suffi- 
cient, and that the space contains a point set E which neither contains a con- 


† In part our proof of Theorem 7 follows methods used by Sierpiński; cf. Sierpiński (II). See
also the introduction for a discussion of the relation of Theorem 7 to some results by Sierpiński and
Kuratowski.




densation point of itself nor is separable. Then, for each point P of E there 
exists a countable subset $D(P)$ of $E$ such that $P$ is not a point or a limit point
of $E-D(P)$. Let $T$ be a well-ordered sequence $(p_1, p_2, p_3, \cdots, p_\omega, p_{\omega+1}, \cdots,
p_\alpha, \cdots)$ of the points of $E$. We shall now define a sequence of the type $\Omega$,
where $\Omega$ is the smallest transfinite ordinal of the third class; $U=(q_1, q_2, q_3, \cdots)$.

Proceed as follows: Let $q_1=p_1$. Let $\beta$ be a definite ordinal less than $\Omega$.
Suppose that $q_x$ has been defined for all ordinals $x<\beta$, and let $U_\beta$ be the set of
all $q_x$'s for such $x$'s. Let $S_\beta$ be the sum of all point sets $D(q_x)$, where $q_x$ is an
element of $U_\beta$. Let $q_\beta$ be the first point of $T$ which is not a point or a limit
point of $S_\beta$.

We shall now show that $q_\beta$ exists for every ordinal $\beta$ less than $\Omega$. For, if $q_\beta$
does not exist for all such ordinals $\beta$, there must be a first such ordinal $\lambda$
for which it does not exist. Then it follows from our definitions that each
point of $E$ is either a point or a limit point of $S_\lambda$. But $S_\lambda$ is the sum of all point
sets $D(q_x)$, where $q_x$ ranges over $U_\lambda$; thus $S_\lambda$ is the sum of a countable number
of countable sets and is countable; then $E$ is separable. Thus, the supposition
that $\lambda$ exists leads to a contradiction.

Next we shall show that $q_\beta$ is an isolated point of $U$. By definition $q_\beta$ is
not a point or a limit point of $U_\beta$. Further, $S_{\beta+1}$, which contains $D(q_\beta)$,
contains no point of $U-U_{\beta+1}$. Since $U-q_\beta=U_\beta+(U-U_{\beta+1})$, it follows that $q_\beta$ is
an isolated point of $U$.

Thus, every point of the uncountable sequence U is an isolated point of U. 
By our hypothesis, however, U must contain a limit point of itself. Thus, the 
supposition that our condition is not sufficient has led to a contradiction. 
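
The transfinite construction of $U$ can be summarized in symbols. This is a sketch only; writing $\Omega$ for the first uncountable ordinal, and $\overline{S_\beta}$ for $S_\beta$ together with its limit points, are our assumptions about notation that is garbled in the source.

```latex
\[
  S_\beta = \sum_{x<\beta} D(q_x), \qquad
  q_\beta = \text{first point of } T \text{ not in } \overline{S_\beta},
  \qquad \beta < \Omega .
\]
% Existence: for \beta < \Omega the set S_\beta is a countable sum of countable
% sets, hence countable; if E \subset \overline{S_\beta}, then E would be separable.
%
% Isolation: q_x \in D(q_x) gives U_\beta \subset S_\beta, so q_\beta is not a
% point or a limit point of U_\beta. For \gamma > \beta,
\[
  q_\gamma \notin \overline{S_\gamma} \supset D(q_\beta),
  \qquad\text{so}\qquad U - U_{\beta+1} \subset E - D(q_\beta),
\]
% and q_\beta is not a point or a limit point of E - D(q_\beta).
```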


THEOREM 8. In order that for each infinite collection of point sets in a space H
it be true that at most a countable number of its elements fail to be subsets of its
limiting set, it is necessary and sufficient that every point set in the space be
a-compact in itself.

Proof. We shall first show that the condition is sufficient. Suppose that
it is not and that there exists in our space a point set $K$ and a collection $G$
of point sets such that $K$ is the limiting set of $G$, but that $G$ contains an
uncountable sub-collection $G_1$, none of whose elements are subsets of $K$. For
each element $g$ of $G_1$ let $P_g$ be a point of $g$ which does not belong to $K$. It follows
by our hypothesis that the set $[P_g]$ contains a point $Q$, every neighborhood
of which contains infinitely many elements of $[P_g]$. Then $Q$ belongs to the
limiting set of $G$, and we are involved in a contradiction.

Conversely, let M be an uncountable point set in a space which satisfies 
our condition. Let G2 be a collection of point sets whose elements are the 




points of $M$, no two elements being the same point. Then $M'$ is the limiting
set of $G_2$ and $M'$ and $M$ have uncountably many points in common. Thus, the
condition is necessary.

Theorems 8 and 9 are generalizations of Theorems 2 and 4, respectively, 
of our first paper, and are of interest in connection with Theorem 7, and also 
with Theorems 6 and 10A, in that they indicate consequences of the distribu- 
tive property. 

THEOREM 9. In order that for a space H every infinite collection of sets should 
contain a countable sub-collection having the same limiting set as the collection 
itself, it is necessary and sufficient that every point set in the space be separable. 


Proof. To prove the necessity of the condition proceed as follows: Let
$N$ be a point set and $M=N$. Let $G$ be a collection such that for each point $x$
of $N$ there exists a collection of elements of $G$, $g_{1x}, g_{2x}, g_{3x}, \cdots$, where $g_{nx}$ is
the pair $(n, x)$. Now proceed by methods analogous to those used in the proof
of Theorem 4 of the author's first paper.

Consider next the sufficiency. Let $G$ be a collection of sets and $K$ be
the limiting set of $G$. By our condition $K$ contains a countable subset
$N=P_1+P_2+P_3+\cdots$ such that $\overline{N}=K$. Then for each $n$ the point $P_n$
belongs to the limiting set of some countable sub-collection of $G$. Suppose that for
some definite $P_i=Q$ this is not true. Let $g_0$ be a definite element of $G$.
For $n$ greater than zero suppose that $g_k$ has been defined for $k<n$. Let
$G_n=G-(g_0+g_1+g_2+\cdots+g_{n-1})$; let $H_n$ be the sum of all elements of $G_n$;
and let $F_n=Q_1+Q_2+Q_3+\cdots$ be a countable set such that $\overline{F}_n\supset H_n\supset F_n$. Then
$Q$ is a point of $F_n'$.

For, let $R$ be an open set containing $Q$. Since $Q$ belongs to the limiting set
of $G$, and hence of $G_n$, $R$ contains points of infinitely many elements of $G_n$.
If $Q$ were common to infinitely many elements of $G_n$, it would be common to
a countable infinity of such elements and would belong to the limiting set
of this countable collection; this, however, is contrary to the definition of $Q$.
Thus $R$ contains points of $H_n$ distinct from $Q$, and hence points of $F_n$. Thus
$Q\in F_n'$. For each positive integer $k$ let $t_k$ denote a definite element of $G_n$ that
contains $Q_k$. Let $T$ denote the aggregate $(t_1, t_2, t_3, \cdots)$. Since $T$ has a finite
or a countable number of elements, $Q$ does not belong to its limiting set.
Hence, there must exist an open set $U$ containing $Q$ which contains points of
at most a finite number of the elements of $T$, say of $t_{k_1}, t_{k_2}, \cdots, t_{k_j}$. Then
$U\cdot F_n=\sum_{i=1}^{j}U\cdot F_n\cdot t_{k_i}$. Since $Q$ belongs to the derived set of $U\cdot F_n$, it follows
that for some $i$, $0<i<j+1$, $Q$ is a limit point of $U\cdot F_n\cdot t_{k_i}$, that is of $t_{k_i}$. Let
$g_n$ be defined as the sum of $n$ and such a $t_{k_i}$, and $E=(g_1, g_2, g_3, \cdots)$. Then $Q$
is a limit point of each element of $E$, and hence belongs to the limiting set of $E$.




Thus, each point $P_n$ of $N$ is a point of the limiting set of some countable 
sub-collection $M_n$ of $G$. Let $M = \sum_{n=1}^{\infty} M_n$, and let $L$ be the limiting set of $M$. 
Then $K = \bar N \supset L \supset N$. Since in a space H derived sets are closed, it follows that 
$L = \bar N = K$. But $M$ is a countable sub-collection of $G$. 
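
The closing step of this argument can be written as a short chain of inclusions. The display below is only a sketch in modern notation, under two assumptions that are not spelled out at this point of the text: that $\bar N$ denotes $N$ together with its limit points, and that the limiting set $L$ is closed, which is how the remark on derived sets in a space H is read here. It assumes the amsmath package.

```latex
% Sketch (assumes amsmath). K = limiting set of G, L = limiting set of the
% countable sub-collection M, N = the countable subset of K with closure K.
\begin{align*}
  N \subset L \subset K = \bar{N}
    &\qquad \text{(each $P_n$ lies in $L$; $M$ is a sub-collection of $G$)} \\
  \bar{N} \subset \bar{L} = L
    &\qquad \text{($L$ is assumed closed)} \\
  L = \bar{N} = K
    &\qquad \text{(combining the two lines above)}
\end{align*}
```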


THEOREM 10. In a Hausdorff space having the distributive property every 
closed point set is separable. 


Proof. Suppose that a space $S_1$ satisfies the hypothesis but not the 
conclusion of our theorem. Then there exists in it an uncountable, closed, 
non-separable point set $E$. By Theorems 6 and 7 the set $E$ contains a point of 
condensation of itself $Q$; similarly $E - Q$ contains a point of condensation of 
itself $P$. Let $U$ and $V$ be mutually exclusive open sets containing $Q$ and $P$, 
respectively. Let $M = U \cdot E$, $N = V \cdot E$, and $H = [S_1 - (U+V)] \cdot E$. Then one of 
the three point sets $M$, $N$, or $H$ is non-separable; for otherwise $E$, their sum, 
would be separable. Consider the two cases: (I) Either $M$ or $N$ is not separable; 
(II) both $M$ and $N$ are separable. Consider first case (I) and suppose that 
it is $N$ that is not separable. Let $K = E - \bar M$; then $H + N = E - M \supset K \supset N$, 
and $K$ is not separable. For, suppose that $K$ is separable and has a countable 
subset $K_1$ such that $\bar K_1 \supset K$. By definition $H$ is the product of the two closed 
sets $E$ and $S_1 - (U+V)$, and thus is closed. It follows that every point of 
$N = K - K \cdot H$ is either a point or a limit point of $V \cdot K_1$. Thus, the supposition 
that $K$ is separable involves a contradiction with the hypothesis that $N$ is not 
separable. 

We shall now define certain sequences by an induction process. Let $z_0$ be 
a point of $M - Q$, $R_0 = z_0$, and $U_0 = U$. Now suppose that $U_k$, $R_k$, and $z_k$ have 
been defined for all non-negative integers $k$ less than the definite integer $n$. 
Let $U_n$ be an open set containing $Q$ such that $U_{n-1} - R_{n-1} \cdot U_{n-1} \supset U_n$, let $z_n$ 
be a point of $M \cdot (U_n - Q)$, and let $R_n$ be an open set containing $z_n$ such that 
$U_n - Q \supset R_n$. Let $F = z_1 + z_2 + z_3 + \cdots$. The existence of $U_n$, $z_n$, and $R_n$ for 
every positive integer $n$ may be shown by making use, in particular, of 
Hausdorff's Axiom D. Since the open sets $R_1, R_2, R_3, \cdots$ are mutually 
exclusive, it follows that $F \cdot F'$ is vacuous. 

For each point $t$ of $K$ and each positive integer $n$ let $g_{tn} = t + z_n$, and $G$ 
be the aggregate of all such $g_{tn}$'s. Since $z_1, z_2, z_3, \cdots$ are distinct points, the 
limiting set of the aggregate $(g_{t1}, g_{t2}, g_{t3}, \cdots)$ is $t + F'$. Thus, each point of 
$K + F'$ belongs to a subset of $K + F'$ which is the limiting set of a 
sub-collection of $G$. Hence, $G$ has a sub-collection $G_1$ whose limiting set is $K + F'$. Let $W$ 
be the product of $K$ and the sum of the elements of $G_1$. Since $E - \bar M = K$, and 
$M \supset F$, it follows that $K \cdot F$ is vacuous. Then every point of $K$ is a point or a 
limit point of $W$, and so is every point of $\bar K$. If the elements of $G_1$ were 
countable, so would be the points of $W$; then $K$ would be separable. This is 
impossible, since $K$ is not separable, and no point of $K$ is a limit point of $\bar K - K$. 
Hence, there are uncountably many elements of $G_1$, and there must exist an 
integer $j$ such that $z_j$ belongs to uncountably many elements of $G_1$. Then $z_j$ 
belongs to the limiting set of $G_1$, that is to $K + F'$. But $z_j$ belongs to neither $K$ 
nor $F'$. Thus, case (I) involves a contradiction. 

Consider next case (II). Since both $M$ and $N$ are separable, so are $\bar M$ and 
$\bar N$. Let $K = E - (\bar M + \bar N)$. Then $K$ is not separable. Define first $F$ and then $G$ 
precisely as in the proof of case (I); by following this proof we again arrive at 
a contradiction. Thus, the supposition that the theorem is not true is 
untenable. 


THEOREM 10A. If a Hausdorff space is nearly a space L and has the 
distributive property, every point set in it is separable. 


This is a consequence of Theorems 10 and 6 and Lemma I. 


THEOREM 11. A space H which satisfies the first countability axiom is a 
Hausdorff space if and only if it is a space S.† 


THEOREM 12. A locally compact space S (Hausdorff space) which satisfies 
the first countability axiom is regular.‡ 


THEOREM 12A. If a Hausdorff space is locally compact at one of its points P, 
and P has a countable family of neighborhoods, the space is regular at P. 


THEOREM 13. A space S (Hausdorff space) which has the distributive prop- 
erty and satisfies the first countability axiom is regular. 


Proof. Suppose that $S_1$ is a space S which satisfies the hypothesis of the 
theorem but contains a point $P$ at which it is not regular. Then there exists in 
$S_1$ an open set $R$ containing $P$ such that if $R_1$ is any open set whatever 
containing $P$, then $\bar R_1$ is not a subset of $R$. Let $(V_1, V_2, V_3, \cdots)$ be a countable family 
of neighborhoods containing $P$ such that ---. Let $U_1 = V_1$, 
$m_1 = 1$, and $M_1$ be a countable subset of $V_1 - P$ which has a unique limit point 
in $R' - R$, say $P_1$. By an induction process we shall now define $U_n$, $M_n$, $m_n$, 
and $P_n$ for every positive integer $n$. Proceed as follows: Suppose they have 
been defined for all $n$'s less than a definite integer $k$. Let $m_k$ be the first 
integer greater than $m_{k-1}$ such that $V_{m_k}$ contains no point of $\sum_{i=1}^{k-1} \bar M_i$, let 
$U_k = V_{m_k}$, and let $M_k$ be a compact countable sequence of points belonging to 
$U_k - P$ and having a unique limit point $P_k$ belonging to $R' - R$. 


† Cf. Hausdorff (I), pp. 263-265, and Fréchet, Démonstration de quelques propriétés des ensembles 
abstraits, American Journal of Mathematics, vol. 50 (1928), p. 65. 

‡ Cf. Alexandroff and Urysohn, Mémoire sur les espaces topologiques compacts, Verhandelingen, 
Koninklijke Akademie van Wetenschappen, Amsterdam, vol. 14 (1929), pp. 28-29. 




We shall now prove that $U_k$, $M_k$, $m_k$, and $P_k$ exist for all positive integers 
$k$; the argument suggests, in particular, the proof for the case $k = 1$. Suppose 
that our proposition has been established for all $n$'s less than $k$, where $1 < k$. 
Then each of the sequences $M_1, M_2, M_3, \cdots, M_{k-1}$ has exactly one limit 
point; the derived set of their sum is $P_1 + P_2 + \cdots + P_{k-1}$, which is a subset 
of $R' - R$. Then $P$ is not a limit point of the closed point set $\sum_{i=1}^{k-1} \bar M_i$, and 
there exists an integer $m_k$ greater than $m_{k-1}$ such that $V_{m_k}$ contains no point 
of this point set. Hence $U_k$ exists. It follows from the definition of $R$ that 
$\bar U_k$ is not a subset of $R$, and that $U_k$ has a limit point $P_k$ in $R' - R$. Since our 
space is a space S, $U_k - P$ has a countable subset $M_k$ such that $P_k$ is the 
unique limit point of every infinite subset of $M_k$. Our existence theorem may 
thus be established by mathematical induction. 

The sequence $P_1, P_2, P_3, \cdots$ contains a sub-sequence $P_{n_1}, P_{n_2}, P_{n_3}, \cdots$, 
having not more than one limit point, such that $n_1 < n_2 < n_3 < \cdots$. If this 
sub-sequence has a limiting set, let it be denoted by the symbol $Q$; otherwise, 
let $Q$ be the null set. Let $P_{n_k} = Q_k$; $M_{n_k} = N_k$; let $Q_{k1}, Q_{k2}, Q_{k3}, \cdots$ be the 
points of $N_k$; $g_{ik} = Q + Q_{1k} + Q_{ki}$; let $G_k$ be the sequence $(g_{1k}, g_{2k}, g_{3k}, \cdots)$; 
$G^* = \sum_{k=1}^{\infty} G_k$; and $K = Q + N_1 + \sum_{k=1}^{\infty} Q_k$. The limiting set of $G_k$ is $Q + Q_{1k} + Q_k$. 
Since $K$ is closed and the space has the distributive property, $G^*$ contains a 
sub-collection $G$ whose limiting set is $K$. Suppose that $G$ contains elements 
in common with at most a finite number of the aggregates 
$G_1, G_2, G_3, \cdots$, say with those having subscripts not greater than a definite 
integer $t$. Then, contrary to the fact that $K$ contains infinitely many distinct 
points, the limiting set of $G$ is a subset of $Q + \sum_{k=1}^{t} (Q_{1k} + Q_k)$. Hence, for 
infinitely many values of $k$ there exist elements of $G$ which contain points of $N_k$. 
Thus, every element of $(V_1, V_2, V_3, \cdots)$ contains points in common with 
infinitely many distinct elements of $G$, and $P$ belongs to the limiting set of $G$. 
Thus, the supposition that our space is not regular has led to a contradiction. 


THEOREM 13A. If a space S has the distributive property and a point P in it 
has a countable family of neighborhoods, the space is regular at P. 


This theorem may be proved by the methods used for Theorem 13. It fol- 
lows by Theorem 3 that the space is locally compact at P. 

Note. The statements of Theorems 12 and 13 differ only in that in the 
hypothesis of the one the distributive property takes the place of local com- 
pactness in that of the other; they are stated for both spaces S and Hausdorff 
spaces. Theorems 12A and 13A are analogous generalizations of Theorems 12 
and 13 respectively; but 12A is stated for a Hausdorff space, while 13A is 
stated for a space S. The question arises as to whether each of the Theorems 
12A and 13A hold for both types of spaces. The author has not found the 
answer for the case of 13A; for 12A the answer is in the negative, as may be 
seen by considering a space 7, when points and limit points are those of 
P+K+) 1-7; of the proof of Theorem 13. 


THEOREM 14. If a Hausdorff space or a space S has the distributive property 
and each point in it has a monotonic family of neighborhoods, then the following 
properties hold for the space: (1) It satisfies the first countability axiom; (2) it is 
both a Hausdorff space and a space S; (3) every point set in it is separable; and 
(4) it is regular and locally compact. 


Proof. By Theorems 6 and 1 our space satisfies the first countability 
axiom; by Theorem 11 it is both a space S and a Hausdorff space; by Theorem 
10 and Lemma I every point set in it is separable; by Theorem 13 it is regu- 
lar; and by Theorem 3 it is locally compact. 

Note. Theorems 14 and 15 may be regarded as a summary of results of 
this paper with regard to conditions necessary for the distributive property. 
Theorem 16 deals with sufficient conditions; Theorems 15 and 17 with 
necessary and sufficient conditions. 


THEOREM 15. If a space S (Hausdorff space) satisfies the first countability 
axiom and has the distributive property, then in order that one of its sub-spaces 
have the distributive property it is necessary and sufficient that the sub-space be 
regular and locally compact. 


Proof. The necessity follows from Theorem 14; the sufficiency from Theo- 
rem 5. 


THEOREM 16. A sufficient condition that a Hausdorff space have the 
distributive property is that it be locally compact and have the Lindelöf property, and 
that every closed point set in it be separable. 


Proof. By Theorem 2 our space satisfies the first countability axiom; by 
Theorem 12 it is regular; by Theorem 11 and Lemma I every point set in it 
is separable. The proof may be completed by following the methods for Theo- 
rem 9 of the author’s first paper, p. 678. Theorem 17 follows from Theorems 
14 and 16. 


THEOREM 17. In order that a Hausdorff space which has the Lindelöf 
property and in which every point has a monotonic family of neighborhoods should 
have the distributive property, it is necessary and sufficient that the space be lo- 
cally compact and that every closed set in it be separable. 


UNIVERSITY OF TEXAS, 
AUSTIN, TEXAS 




A CORRECTION TO THE PAPER “ON EFFECTIVE 
SETS OF POINTS IN RELATION TO 
INTEGRAL FUNCTIONS”* 


BY 
V. GANAPATHY IYER 


An additional hypothesis is necessary for the truth of Lemma 3. In the 
relation $g'(z_n) = P_n'(z_n) Q_n(z_n)$, in order that Lemma 3 may be true it is necessary 
to prove that $\lim_{n\to\infty} (\log|P_n'(z_n)| / |z_n|^{\rho}) = 0$. Under the conditions stated 
in the Lemma we can prove only that $\limsup_{n\to\infty} (\log|P_n'(z_n)| / |z_n|^{\rho}) \le 0$. If 
we assume also that $\liminf_{n\to\infty} (\log u_n / |z_n|^{\rho}) \ge 0$, where Zn , it 
will follow that $\liminf_{n\to\infty} (\log|P_n'(z_n)| / |z_n|^{\rho}) \ge 0$, hence Lemma 3 will hold. 
This additional condition is obviously satisfied in the particular case worked 
out in Lemma 4 since the circles are non-overlapping after a certain stage 
and each circle contains only one zero. If $u_n$ is sufficiently small, Lemma 3 
will cease to be true, hence one of the doubtful points raised in §3.6 is 
answered in the negative. 
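
The way the added hypothesis closes the gap is the usual squeeze between the upper and lower limits. The display below is a sketch only: $a_n$ is shorthand introduced here, not in the original, for the quotient whose limit Lemma 3 requires, and the two inequality signs are read as $\le$ and $\ge$.

```latex
% a_n : hypothetical shorthand for the quotient appearing in the limits above.
\[
  \limsup_{n\to\infty} a_n \le 0
  \quad\text{and}\quad
  \liminf_{n\to\infty} a_n \ge 0
  \;\Longrightarrow\;
  0 \le \liminf_{n\to\infty} a_n \le \limsup_{n\to\infty} a_n \le 0
  \;\Longrightarrow\;
  \lim_{n\to\infty} a_n = 0 .
\]
```

The upper bound alone leaves open the possibility that $a_n$ tends to a negative value or to $-\infty$ along a subsequence, which is the case the correction guards against.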


* Received by the editors December 6, 1937. Cf. these Transactions, vol. 42 (1937), pp. 358-365. 


MADRAS UNIVERSITY, 
MADRAS, S. INDIA 




